Preface to Volume 2 of the Fourth Edition


The main focus of the second volume of this fourth edition, as in the third, is 
on the two non-Abelian quantum gauge field theories of the Standard Model 
— that is, QCD and the electroweak theory of Glashow, Salam and Weinberg. 
We preserve the same division into four parts: non-Abelian symmetries, both 
global and local; QCD and the renormalization group; spontaneously broken 
symmetry; and weak interaction phenomenology and the electroweak theory. 

However, the book has always combined theoretical development with dis- 
cussion of relevant experimental results. And it is on the experimental side 
that most progress has been made in the ten years since the third edition 
appeared - first of all, in the study of CP violation in B-meson physics, and 
in neutrino oscillations. The inclusion of these results, and the increasing im- 
portance of the topics, have required some reorganization, and a new chapter 
(21) devoted wholly to them. We concentrate mainly on CP-violation in B- 
meson decays, particularly on the determination of the angles of the unitarity 
triangle from B-meson oscillations. CP-violation in K-meson systems is also 
discussed. In the neutrino sector, we describe some of the principal experi- 
ments which have led to our current knowledge of the mass-squared differences 
and the mixing angles. In discussing weak interaction phenomenology, we keep 
in view the possibility that neutrinos may turn out to be Majorana particles, 
an outcome for which we have prepared the reader in (new) chapters 4 and 7 
of volume 1. 

More recently, on July 4, 2012, the ATLAS and CMS collaborations at 
the CERN LHC announced the discovery of a boson of mass between 125 and 
126 GeV, with production and decay characteristics which are consistent (at 
the 1σ level) with those of the Standard Model Higgs boson. We can now
conclude our treatment of the electroweak theory, and this volume, with a 
discussion of this historic discovery, which opens a new era in particle physics 
— one in which the electroweak symmetry-breaking (Higgs) sector of the SM 
will be rigorously tested. 

Our treatment of a number of topics has been updated and, we hope, improved.
In QCD, the definition of 2-jet cross sections in $e^+e^-$ annihilation
is explained, and used in a short discussion of jet algorithms (sections 14.5 
and 14.6). Progress in lattice QCD is recognized with the inclusion of some 
of the recent impressive results using dynamical fermions (section 16.5). In 
the chapter on chiral symmetry breaking, a new section (18.3) introduces the
important technique of effective Lagrangians, including the extension to the
three-flavour case and the associated mass relations. A much fuller account is 
given of three-generation quark mixing and the CKM matrix (section 20.7.3), 
as preparation for chapter 21. The essential points in chapter 21 of the pre- 
vious edition, relating to problems with the current-current and IVB models, 
now provide the introductory motivation for the GSW theory in chapter 22. 

One item has been banished to an appendix: geometrical aspects of gauge 
theories, which did after all seem to interrupt the flow of chapter 13 too much 
(but we hope readers will not ignore it). And another has been brought in 
from the cold: as already mentioned, Majorana fermions now find themselves 
appearing for the first time in volume 1. 


Global Non-Abelian Symmetries


12.1 The Standard Model 


In the preceding volume, a very successful dynamical theory, QED, has been
introduced, based on the remarkably simple gauge principle: namely, that the 
theory should be invariant under local phase transformations on the wave- 
functions (chapter 2) or field operators (chapter 7) of charged particles. Such 
transformations were characterized as Abelian in section 2.6, since the phase 
factors commuted. The second volume of this book will be largely concerned 
with the formulation and elementary application of the remaining two dynam- 
ical theories within the Standard Model - that is, QCD and the electroweak 
theory. They are built on a generalization of the gauge principle, in which the 
transformations involve more than one state, or field, at a time. In that case, 
the ‘phase factors’ become matrices, which generally do not commute with 
each other, and the associated symmetry is called a ‘non-Abelian’ one. When 
the phase factors are independent of the space-time coordinate x, the symmetry
is a ‘global non-Abelian’ one; when they are allowed to depend on x, one
is led to a non-Abelian gauge theory. Both QCD and the electroweak theory 
are of the latter type, providing generalizations of the Abelian U(1) gauge 
theory which is QED. It is a striking fact that all three dynamical theories in 
the Standard Model are based on a gauge principle of local phase invariance. 

In this chapter we shall be mainly concerned with two global non-Abelian 
symmetries, which lead to useful conservation laws but not to any specific 
dynamical theory. We begin in section 12.1 with the first non-Abelian sym- 
metry to be used in particle physics, the hadronic isospin ‘SU(2) symmetry’ 
proposed by Heisenberg (1932) in the context of nuclear physics, and now 
understood as following from QCD and the smallness of the u and d quark 
masses as compared with the QCD scale parameter $\Lambda_{\overline{\rm MS}}$ (see section 18.3.3).
In section 12.2 we extend this to SU(3)$_{\rm f}$ flavour symmetry, as was first done
by Gell-Mann (1961) and Ne’eman (1961), an extension seen, in its turn, as
reflecting the smallness of the u, d and s quark masses as compared with $\Lambda_{\overline{\rm MS}}$.
The ‘wavefunction’ approach of sections 12.1 and 12.2 is then reformulated in 
field-theoretic language in section 12.3. 

In the last section of this chapter, we shall introduce the idea of a global 
chiral symmetry, which is a symmetry of theories with massless fermions. This 
may be expected to be a good approximate symmetry for the u and d quarks. 
But the anticipated observable consequences of this symmetry (for example,
nucleon parity doublets) appear to be absent. This puzzle will be resolved 
in Part VII, via the profoundly important concept of ‘spontaneous symmetry 
breaking’. 

The formalism introduced in this chapter for SU(2) and SU(3) will be 
required again in the following one, when we consider the local versions of 
these non-Abelian symmetries and the associated dynamical gauge theories. 
The whole modern development of non-Abelian gauge theories began with 
the attempt by Yang and Mills (1954) (see also Shaw 1955) to make hadronic 
isospin into a local symmetry. However, the beautiful formalism developed 
by these authors turned out not to describe interactions between hadrons. 
Instead, it describes the interactions between the constituents of the hadrons, 
namely quarks — and this in two respects. First, a local SU(3) symmetry 
(called SU(3)$_{\rm c}$) governs the strong interactions of quarks, binding them into
hadrons (see Part VI). Secondly, a local SU(2) symmetry (called weak isospin) 
governs the weak interactions of quarks (and leptons); together with QED, this 
constitutes the electroweak theory (see Part VIII). It is important to realize 
that, despite the fact that each of these two local symmetries is based on 
the same group as one of the earlier global (flavour) symmetries, the physics 
involved is completely different. In the case of the strong quark interactions, 
the SU(3)$_{\rm c}$ group refers to a new degree of freedom (‘colour’) which is quite
distinct from flavour u, d, s (see chapter 14). In the weak interaction case, 
since the group is an SU(2), it is natural to use ‘isospin language’ in talking 
about it, particularly since flavour degrees of freedom are involved. But we 
must always remember that it is weak isospin, which (as we shall see in chapter 
20) is an attribute of leptons as well as of quarks, and hence physically quite 
distinct from hadronic isospin. Furthermore, it is a parity-violating chiral 
gauge theory. 

Despite the attractive conceptual unity associated with the gauge prin- 
ciple, the way in which each of QCD and the electroweak theory ‘works’ is 
actually quite different from QED, and from each other. Indeed it is worth 
emphasizing very strongly that it is, a priori, far from obvious why either the 
strong interactions between quarks, or the weak interactions, should have any- 
thing to do with gauge theories at all. Just as in the U(1) (electromagnetic) 
case, gauge invariance forbids a mass term in the Lagrangian for non-Abelian 
gauge fields, as we shall see in chapter 13. Thus it would seem that gauge 
field quanta are necessarily massless. But this, in turn, would imply that the 
associated forces must have a long-range (Coulombic) part, due to exchange of 
these massless quanta; and of course in neither the strong nor the weak interaction
case is that what is observed.¹ As regards the former, the gluon quanta
are indeed massless, but the contradiction is resolved by non-perturbative ef- 
fects which lead to confinement, as we indicated in chapter 1. We shall discuss 


¹ Pauli had independently developed the theory of non-Abelian gauge fields during 1953,
but did not publish any of this work because of the seeming physical irrelevancy associated 
with the masslessness problem (Enz 2002, pages 474-82; Pais 2000, pages 242-5). 
this further in chapter 16. In weak interactions, a third realization appears:
the gauge quanta acquire mass via (it is believed) a second instance of spon- 
taneous symmetry breaking, as will be explained in Part VII. In fact a further 
application of this idea is required in the electroweak theory, because of the 
chiral nature of the gauge symmetry in this case: the quark and lepton masses 
also must be ‘spontaneously generated’. 


12.2 The flavour symmetry SU(2)$_{\rm f}$
12.2.1 The nucleon isospin doublet and the group SU(2)


The transformations initially considered in connection with the gauge principle
in section 2.5 were just global phase transformations on a single wavefunction

$$\psi \to \psi' = e^{i\alpha}\psi. \tag{12.1}$$
The generalization to non-Abelian invariances comes when we take the sim- 
ple step — but one with many ramifications — of considering more than one 
wavefunction, or state, at a time. Quite generally in quantum mechanics, we 
know that whenever we have a set of states which are degenerate in energy (or 
mass) there is no unique way of specifying the states: any linear combination 
of some initially chosen set of states will do just as well, provided the normal- 
ization conditions on the states are still satisfied. Consider, for example, the 
simplest case of just two such states — to be specific, the neutron and proton 
(figure 12.1). This single near coincidence of the masses was enough to suggest 
to Heisenberg (1932) that, as far as the strong nuclear forces were concerned 
(electromagnetism being negligible by comparison), the two states could be 
regarded as truly degenerate, so that any arbitrary linear combination of neu- 
tron and proton wavefunctions would be entirely equivalent, as far as this 
force was concerned, for a single ‘neutron’ or single ‘proton’ wavefunction. 
This hypothesis became known as ‘charge independence of nuclear forces’. 
Thus redefinitions of neutron and proton wavefunctions could be allowed, of 
the form 
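$$\psi_p \to \psi'_p = \alpha\psi_p + \beta\psi_n \tag{12.2}$$
$$\psi_n \to \psi'_n = \gamma\psi_p + \delta\psi_n \tag{12.3}$$
for complex coefficients $\alpha$, $\beta$, $\gamma$ and $\delta$. In particular, since $\psi_p$ and $\psi_n$ are
degenerate, we have

$$H\psi_p = E\psi_p, \qquad H\psi_n = E\psi_n \tag{12.4}$$
from which it follows that

$$H\psi'_p = H(\alpha\psi_p + \beta\psi_n) = \alpha H\psi_p + \beta H\psi_n \tag{12.5}$$
$$= E(\alpha\psi_p + \beta\psi_n) = E\psi'_p \tag{12.6}$$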

FIGURE 12.1
Early evidence for isospin symmetry. 


and similarly 
$$H\psi'_n = E\psi'_n \tag{12.7}$$


showing that the redefined wavefunctions still describe two states with the 
same energy degeneracy. 

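The two-fold degeneracy seen in figure 12.1 is suggestive of that found in
spin-1/2 systems in the absence of any magnetic field; the $s_z = \pm\tfrac{1}{2}$ components
are degenerate. The analogy can be brought out by introducing the
two-component nucleon isospinor


$$\psi^{(1/2)} \equiv \begin{pmatrix}\psi_p \\ \psi_n\end{pmatrix} \equiv \psi_p\chi_p + \psi_n\chi_n \tag{12.8}$$

where

$$\chi_p = \begin{pmatrix}1 \\ 0\end{pmatrix}, \qquad \chi_n = \begin{pmatrix}0 \\ 1\end{pmatrix}. \tag{12.9}$$


In $\psi^{(1/2)}$, $\psi_p$ is the amplitude for the nucleon to have ‘isospin up’, and $\psi_n$ is
that for it to have ‘isospin down’.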

As far as the states are concerned, this terminology arises, of course, from 
the formal identity between the ‘isospinors’ of (12.9) and the two-component 
eigenvectors (3.60) corresponding to eigenvalues $\pm\tfrac{1}{2}\hbar$ of (true) spin: compare
also (3.61) and (12.8). It is important to be clear, however, that the degrees of
freedom involved in the two cases are quite distinct; in particular, even though
both the proton and the neutron have (true) spin-1/2, the transformations
(12.2) and (12.3) leave the (true) spin part of their wavefunctions completely
untouched. Indeed, we are suppressing the spinor part of both wavefunctions 
altogether (they are of course 4-component Dirac spinors). As we proceed, 
the precise mathematical nature of this ‘spin-1/2’ analogy will become clear. 

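Equations (12.2) and (12.3) can be compactly written in terms of $\psi^{(1/2)}$
as


$$\psi^{(1/2)} \to \psi^{(1/2)\prime} = V\psi^{(1/2)}, \qquad V = \begin{pmatrix}\alpha & \beta \\ \gamma & \delta\end{pmatrix} \tag{12.10}$$
where V is the indicated complex 2 x 2 matrix. Heisenberg's proposal, then,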
was that the physics of strong interactions between nucleons remained the 
same under the transformation (12.10): in other words, a symmetry was in- 
volved. We must emphasise that such a symmetry can only be exact in the 
absence of electromagnetic interactions: it is therefore an intrinsically approx- 
imate symmetry, though presumably quite a useful one in view of the relative 
weakness of electromagnetic interactions as compared to hadronic ones. 

We now consider the general form of the matrix V, as constrained by
various relevant restrictions: quite remarkably, we shall discover that (after 
extracting an overall phase) V has essentially the same mathematical form 
as the matrix U of (4.33), which we encountered in the discussion of the 
transformation of (real) spin wavefunctions under rotations of the (real) space 
axes. It will be instructive to see how the present discussion leads to the same 
form (4.33). 

We first note that V of (12.10) depends on four arbitrary complex numbers, 
or alternatively on eight real parameters. By contrast, the matrix U of (4.33) 
depends on only three real parameters, which we may think of in terms of two 
to describe the direction of the axis of rotation, and a third for the angle of 
rotation. However, V is subject to certain restrictions, and these reduce the 
number of free parameters in V to three, as we now discuss. First, in order 
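to preserve the normalization of $\psi^{(1/2)}$ we require


$$\psi^{(1/2)\prime\dagger}\psi^{(1/2)\prime} = \psi^{(1/2)\dagger}V^\dagger V\psi^{(1/2)} = \psi^{(1/2)\dagger}\psi^{(1/2)} \tag{12.11}$$
which implies that V has to be unitary:
$$V^\dagger V = 1_2, \tag{12.12}$$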


where $1_2$ is the unit 2 x 2 matrix. Clearly this unitarity property is in no
way restricted to the case of two states: the transformation coefficients for
n degenerate states will form the entries of an n x n unitary matrix. A
trivialization is the case n = 1, for which, as we noted in section 2.6, V reduces
to a single phase factor as in (12.1), indicating how all the previous work is 
going to be contained as a special case of these more general transformations. 
Indeed, from elementary properties of determinants we have 


$$\det V^\dagger V = \det V^\dagger \cdot \det V = \det V^* \cdot \det V = |\det V|^2 = 1 \tag{12.13}$$


so that
$$\det V = \exp(i\theta) \tag{12.14}$$


where $\theta$ is a real number. We can separate off such an overall phase factor from
the transformations mixing ‘p’ and ‘n’, because it corresponds to a rotation 
of the phase of both p and n wavefunctions by the same amount: 


$$\psi'_p = e^{i\alpha}\psi_p, \qquad \psi'_n = e^{i\alpha}\psi_n. \tag{12.15}$$


The V corresponding to (12.15) is $V = e^{i\alpha} 1_2$, which has determinant $\exp(2i\alpha)$
and is therefore of the form (12.14) with $\theta = 2\alpha$. In the field-theoretic formalism
of section 7.2, such a symmetry can be shown to lead to the conservation of 
baryon number Ni + Na — Na — Na, where bar denotes the antiparticle. 

The new physics will lie in the remaining transformations which satisfy 


$$\det V = +1. \tag{12.16}$$
Such a matrix is said to be a special unitary matrix, which simply means it
has unit determinant. Thus, finally, the V’s we are dealing with are special, 
unitary, 2 x 2 matrices. The set of all such matrices form a group. The 
general defining properties of a group are given in appendix M. In the present 
case, the elements of the group are all such 2 x 2 matrices, and the ‘law of 
combination’ is just ordinary matrix multiplication. It is straightforward to 
verify (problem 12.1) that all the defining properties are satisfied here; the 
group is called ‘SU(2)’, the ‘S’ standing for ‘special’, the ‘U’ for ‘unitary’, and 
the ‘2’ for ‘2 x 2’. 

SU(2) is actually an example of a Lie group (see appendix M). Such groups 
have the important property that their physical consequences may be found 
by considering ‘infinitesimal’ transformations, that is — in this case — matrices 
V which differ only slightly from the ‘no-change’ situation corresponding to 
$V = 1_2$. For such an infinitesimal SU(2) matrix $V_{\rm infl}$ we may therefore write


$$V_{\rm infl} = 1_2 + i\xi \tag{12.17}$$
where $\xi$ is a 2 x 2 matrix whose entries are all first-order small quantities. The
condition $\det V_{\rm infl} = 1$ now reduces, on neglect of second-order terms $O(\xi^2)$,
to the condition (see problem 12.2)

$$\mathrm{Tr}\,\xi = 0. \tag{12.18}$$
The condition that $V_{\rm infl}$ be unitary, i.e.
$$(1_2 + i\xi)(1_2 - i\xi^\dagger) = 1_2 \tag{12.19}$$
similarly reduces (in first order) to the condition
$$\xi = \xi^\dagger. \tag{12.20}$$


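Thus $\xi$ is a 2 x 2 traceless Hermitian matrix, which means it must have the
form
$$\xi = \begin{pmatrix} a & b - ic \\ b + ic & -a \end{pmatrix} \tag{12.21}$$
where a, b, c are infinitesimal real parameters. Writing
$$a = \epsilon_3/2, \qquad b = \epsilon_1/2, \qquad c = \epsilon_2/2, \tag{12.22}$$
(12.21) can be put in the more suggestive form
$$\xi = \boldsymbol{\epsilon}\cdot\boldsymbol{\tau}/2 \tag{12.23}$$
where $\boldsymbol{\epsilon}$ stands for the three real quantities


$$\boldsymbol{\epsilon} = (\epsilon_1, \epsilon_2, \epsilon_3) \tag{12.24}$$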
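which are all first-order small. The three matrices $\boldsymbol{\tau}$ are just the familiar
Hermitian Pauli matrices


$$\tau_1 = \begin{pmatrix}0 & 1 \\ 1 & 0\end{pmatrix}, \quad \tau_2 = \begin{pmatrix}0 & -i \\ i & 0\end{pmatrix}, \quad \tau_3 = \begin{pmatrix}1 & 0 \\ 0 & -1\end{pmatrix}, \tag{12.25}$$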


here called ‘tau’ precisely in order to distinguish them from the mathemati- 
cally identical ‘sigma’ matrices which are associated with the real spin degree 
of freedom. Hence a general infinitesimal SU(2) matrix takes the form 


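$$V_{\rm infl} = (1_2 + i\boldsymbol{\epsilon}\cdot\boldsymbol{\tau}/2), \tag{12.26}$$


and an infinitesimal SU(2) transformation of the p-n doublet is specified by


$$\begin{pmatrix}\psi'_p \\ \psi'_n\end{pmatrix} = (1_2 + i\boldsymbol{\epsilon}\cdot\boldsymbol{\tau}/2)\begin{pmatrix}\psi_p \\ \psi_n\end{pmatrix}. \tag{12.27}$$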


The $\tau$-matrices clearly play an important role, since they determine the
forms of the three independent infinitesimal SU(2) transformations. They are
called the generators of infinitesimal SU(2) transformations; more precisely,
the matrices $\boldsymbol{\tau}/2$ provide a particular matrix representation of the generators,
namely the two-dimensional, or ‘fundamental’ one (see appendix M). We note
that they do not commute amongst themselves: rather, introducing
$\boldsymbol{T}^{(1/2)} \equiv \boldsymbol{\tau}/2$, we find (see problem 12.3)


$$[T_i^{(1/2)}, T_j^{(1/2)}] = i\epsilon_{ijk}T_k^{(1/2)} \tag{12.28}$$


where i,j and k run from 1 to 3, and a sum on the repeated index k is 
understood as usual. The reader will recognize the commutation relations 
(12.28) as being precisely the same as those of angular momentum operators 
in quantum mechanics: 


$$[J_i, J_j] = i\epsilon_{ijk}J_k. \tag{12.29}$$


In that case, the choice $J_i = \sigma_i/2 \equiv J_i^{(1/2)}$ would correspond to a (real) spin-1/2
system. Here the identity between the tau's and the sigma's gives us a
good reason to regard our ‘p-n’ system as formally analogous to a ‘spin-1/2’
one. Of course, the ‘analogy’ was made into a mathematical identity by the
judicious way in which $\xi$ was parametrised in (12.23).
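
The commutation relations (12.28) can be checked by direct matrix multiplication. The following is a minimal sketch using only the Python standard library; the helper names (`matmul`, `epsilon`, `satisfies_su2_algebra`) are illustrative choices and are not taken from the text.

```python
# Pauli matrices tau_1, tau_2, tau_3 of (12.25), stored as nested lists.
TAU = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def matmul(a, b):
    """Matrix product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[r][m] * b[m][c] for m in range(n)) for c in range(n)]
            for r in range(n)]


def epsilon(i, j, k):
    """Totally antisymmetric symbol for indices drawn from 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2


def satisfies_su2_algebra(gens, tol=1e-12):
    """Check [T_i, T_j] = i eps_ijk T_k for three square matrices."""
    n = len(gens[0])
    for i in range(3):
        for j in range(3):
            ab = matmul(gens[i], gens[j])
            ba = matmul(gens[j], gens[i])
            for r in range(n):
                for c in range(n):
                    rhs = sum(1j * epsilon(i, j, k) * gens[k][r][c]
                              for k in range(3))
                    if abs(ab[r][c] - ba[r][c] - rhs) > tol:
                        return False
    return True


# Generators in the fundamental representation: T^(1/2) = tau / 2.
T_HALF = [[[x / 2 for x in row] for row in t] for t in TAU]

print(satisfies_su2_algebra(T_HALF))  # True
print(satisfies_su2_algebra(TAU))     # False: without the factor 1/2 the
                                      # commutator is 2i eps_ijk tau_k
```

The second call shows why the generators are taken to be $\boldsymbol{\tau}/2$ rather than $\boldsymbol{\tau}$: only with the factor 1/2 do the structure constants coincide with those of (12.29).
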
The form for a finite SU(2) transformation V may then be obtained from 
the infinitesimal form using the result 
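$$e^A = \lim_{n\to\infty}(1 + A/n)^n \tag{12.30}$$
generalized to matrices. Let $\boldsymbol{\epsilon} = \boldsymbol{\alpha}/n$, where $\boldsymbol{\alpha} = (\alpha_1, \alpha_2, \alpha_3)$ are three real
finite (not infinitesimal) parameters, apply the infinitesimal transformation n
times, and let n tend to infinity. We obtain


$$V = \exp(i\boldsymbol{\alpha}\cdot\boldsymbol{\tau}/2) \tag{12.31}$$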
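so that


$$\psi^{(1/2)\prime} \equiv \begin{pmatrix}\psi'_p \\ \psi'_n\end{pmatrix} = \exp(i\boldsymbol{\alpha}\cdot\boldsymbol{\tau}/2)\begin{pmatrix}\psi_p \\ \psi_n\end{pmatrix} = \exp(i\boldsymbol{\alpha}\cdot\boldsymbol{\tau}/2)\,\psi^{(1/2)}. \tag{12.32}$$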
Note that in the finite transformation, the generators appear in the exponent. 
Indeed, (12.31) has the form 


$$V = \exp(iG) \tag{12.33}$$
where $G = \boldsymbol{\alpha}\cdot\boldsymbol{\tau}/2$, from which the unitary property of V easily follows:
$$V^\dagger = \exp(-iG^\dagger) = \exp(-iG) = V^{-1} \tag{12.34}$$


where we used the Hermiticity of the tau’s. Equation (12.33) has the general 
form 
unitary matrix = exp(i Hermitian matrix) (12.35) 


where the ‘Hermitian matrix’ is composed of the generators and the trans- 
formation parameters. We shall meet generalizations of this structure in the 
following sub-section for SU(2), again in section 12.2 for SU(3), and a field 
theoretic version of it in section 12.3. 
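
The passage from the infinitesimal form to the finite one can be illustrated numerically. The sketch below builds $(1_2 + iG/n)^n$ for a large n, as in (12.30), and compares it with the standard closed form $\exp(iG) = \cos(|\boldsymbol{\alpha}|/2)\,1_2 + i\sin(|\boldsymbol{\alpha}|/2)\,\hat{\boldsymbol{\alpha}}\cdot\boldsymbol{\tau}$, which holds because $(\hat{\boldsymbol{\alpha}}\cdot\boldsymbol{\tau})^2 = 1_2$. The function names, the choice n = 2^20 and the sample value of $\boldsymbol{\alpha}$ are assumptions made for illustration only.

```python
import math

# Pauli matrices of (12.25).
TAU = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
IDENTITY = [[1, 0], [0, 1]]


def matmul(a, b):
    """Product of two 2 x 2 matrices stored as nested lists."""
    return [[sum(a[r][m] * b[m][c] for m in range(2)) for c in range(2)]
            for r in range(2)]


def generator_sum(alpha):
    """Return G = alpha . tau / 2 as a 2 x 2 nested list."""
    return [[sum(alpha[k] * TAU[k][r][c] for k in range(3)) / 2
             for c in range(2)] for r in range(2)]


def v_from_limit(alpha, doublings=20):
    """Approximate exp(iG) by (1 + iG/n)^n with n = 2**doublings."""
    n = 2 ** doublings
    g = generator_sum(alpha)
    m = [[IDENTITY[r][c] + 1j * g[r][c] / n for c in range(2)]
         for r in range(2)]
    for _ in range(doublings):  # squaring k times gives the power 2**k
        m = matmul(m, m)
    return m


def v_closed_form(alpha):
    """exp(iG) = cos(a/2) 1 + i sin(a/2) alpha_hat . tau, with a = |alpha|."""
    a = math.sqrt(sum(x * x for x in alpha))
    if a == 0:
        return [[complex(x) for x in row] for row in IDENTITY]
    g = generator_sum(alpha)  # equals (a/2) alpha_hat . tau
    cos_half, sin_half = math.cos(a / 2), math.sin(a / 2)
    return [[cos_half * IDENTITY[r][c] + 1j * sin_half * g[r][c] * 2 / a
             for c in range(2)] for r in range(2)]


def dagger(m):
    """Hermitian conjugate of a 2 x 2 matrix."""
    return [[complex(m[c][r]).conjugate() for c in range(2)] for r in range(2)]


def det2(m):
    """Determinant of a 2 x 2 matrix."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def max_diff(a, b):
    """Largest absolute entry of a - b."""
    return max(abs(a[r][c] - b[r][c]) for r in range(2) for c in range(2))


alpha = (0.3, -1.1, 0.7)
V = v_closed_form(alpha)
print("unitarity defect:", max_diff(matmul(dagger(V), V), IDENTITY))
print("det V - 1       :", abs(det2(V) - 1))
print("limit vs closed :", max_diff(V, v_from_limit(alpha)))
```

All three printed numbers should be small: the first two only at the level of rounding error, the third at the level of the O(1/n) truncation error of the finite product.
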

As promised, (12.32) has essentially the same mathematical form as (4.33). 
In each case, three real parameters appear. In (4.33) they describe the axis 
and angle of a physical rotation in real three-dimensional space: we can always 
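write $\boldsymbol{\alpha} = |\boldsymbol{\alpha}|\hat{\boldsymbol{\alpha}}$ and identify $|\boldsymbol{\alpha}|$ with the angle $\theta$ and $\hat{\boldsymbol{\alpha}}$ with the axis $\hat{\boldsymbol{n}}$ of
the rotation. In (12.32) there are just the three parameters in $\boldsymbol{\alpha}$.²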

In the form (12.32), it is clear that our 2 x 2 isospin transformation is a 
generalization of the global phase transformation of (12.1), except that: 


(i) there are now three ‘phase angles’ $\boldsymbol{\alpha}$;


(ii) there are non-commuting matrix operators (the $\tau$’s) appearing in the exponent.


The last fact is the reason for the description ‘non-Abelian’ phase invariance. 
As the commutation relations for the $\tau$ matrices show, SU(2) is a non-Abelian
group in that two SU(2) transformations do not in general commute. By con- 
trast, in the case of electric charge or particle number, successive transforma- 
tions clearly commute: this corresponds to an Abelian phase invariance and, 
as noted in section 2.6, to an Abelian U(1) group. 

We may now put our initial ‘spin-1/2’ analogy on a more precise mathe- 
matical footing. In quantum mechanics, states within a degenerate multiplet 
may conveniently be characterized by the eigenvalues of a complete set of Her- 
mitian operators which commute with the Hamiltonian and with each other. 


² It is not obvious that the general SU(2) matrix can be parametrized by an angle $\theta$
with $0 \le \theta \le 2\pi$, and $\hat{\boldsymbol{n}}$: for further discussion of the relation between SU(2) and the
three-dimensional rotation group, see appendix M, section M.7.
In the case of the p-n doublet, it is easy to see what these operators are. We
may write (12.4), (12.6) and (12.7) as


$$H_2\psi^{(1/2)} = E\psi^{(1/2)} \tag{12.36}$$


and
$$H_2\psi^{(1/2)\prime} = E\psi^{(1/2)\prime}, \tag{12.37}$$


where $H_2$ is the 2 x 2 matrix


$$H_2 = \begin{pmatrix}H & 0 \\ 0 & H\end{pmatrix}. \tag{12.38}$$


Hence $H_2$ is proportional to the unit matrix in this two-dimensional space,
and it therefore commutes with the tau’s:


$$[H_2, \boldsymbol{\tau}] = 0. \tag{12.39}$$
It then also follows that $H_2$ commutes with V, or equivalently
$$VH_2V^{-1} = H_2 \tag{12.40}$$


which is the statement that $H_2$ is invariant under the transformation (12.32).
Now the tau’s are Hermitian, and hence correspond to possible observables. 
Equation (12.39) implies that their eigenvalues are constants of the motion 
(i.e. conserved quantities), associated with the invariance (12.40). But the 
tau’s do not commute amongst themselves and so according to the general 
principles of quantum mechanics we cannot give definite values to more than 
one of them at a time. The problem of finding a classification of the states 
which makes the maximum use of (12.39), given the commutation relations 
(12.28), is easily solved by making use of the formal identity between the 
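operators $\tau_i/2$ and angular momentum operators $J_i$ (cf (12.29)). The answer
is³ that the total squared ‘spin’


$$(\boldsymbol{T}^{(1/2)})^2 = \left(\tfrac{1}{2}\boldsymbol{\tau}\right)^2 = \tfrac{1}{4}(\tau_1^2 + \tau_2^2 + \tau_3^2) = \tfrac{3}{4}1_2 \tag{12.41}$$


and one component of spin, say $T_3^{(1/2)} = \tfrac{1}{2}\tau_3$, can be given definite values
simultaneously. The corresponding eigenfunctions are just the $\chi_p$’s and $\chi_n$’s
of (12.9), which satisfy


$$\tfrac{1}{4}\boldsymbol{\tau}^2\chi_p = \tfrac{3}{4}\chi_p, \qquad \tfrac{1}{2}\tau_3\chi_p = \tfrac{1}{2}\chi_p \tag{12.42}$$
$$\tfrac{1}{4}\boldsymbol{\tau}^2\chi_n = \tfrac{3}{4}\chi_n, \qquad \tfrac{1}{2}\tau_3\chi_n = -\tfrac{1}{2}\chi_n. \tag{12.43}$$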


The reason for the ‘spin’ part of the name ‘isospin’ should by now be clear; 
the term is actually a shortened version of the historical one ‘isotopic spin’. 


³ See for example Mandl (1992).

In concluding this section we remark that, in this two-dimensional n-p
space, the electromagnetic charge operator is represented by the matrix


$$Q_{\rm em} = \begin{pmatrix}1 & 0 \\ 0 & 0\end{pmatrix} = \tfrac{1}{2}(1_2 + \tau_3). \tag{12.44}$$
It is clear that although $Q_{\rm em}$ commutes with $\tau_3$, it does not commute with
either $\tau_1$ or $\tau_2$. Thus, as we would expect, electromagnetic corrections to the
strong interaction Hamiltonian will violate SU(2) symmetry. 


12.2.2 Larger (higher-dimensional) multiplets of SU(2) in 
nuclear physics 


For the single nucleon states considered so far, the foregoing is really nothing 
more than the general quantum mechanics of a two-state system, phrased in 
‘spin-1/2’ language. The real power of the isospin (SU(2)) symmetry concept 
becomes more apparent when we consider states of several nucleons. For 
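A nucleons in the nucleus, we introduce three ‘total isospin operators’
$\boldsymbol{T} = (T_1, T_2, T_3)$ via
$$\boldsymbol{T} = \tfrac{1}{2}\boldsymbol{\tau}_{(1)} + \tfrac{1}{2}\boldsymbol{\tau}_{(2)} + \ldots + \tfrac{1}{2}\boldsymbol{\tau}_{(A)} \tag{12.45}$$
which are Hermitian. Here $\boldsymbol{\tau}_{(n)}$ is the $\tau$-matrix for the nth nucleon. The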
Hamiltonian H describing the strong interactions of this system is presumed 
to be invariant under the transformation (12.40) for all the nucleons indepen- 
dently. It then follows that 
$$[H, \boldsymbol{T}] = 0. \tag{12.46}$$


Thus the eigenvalues of the $\boldsymbol{T}$ operators are constants of the motion. Further,
since the isospin operators for different nucleons commute with each other
(they are quite independent), the commutation relations (12.28) for each of
the individual $\boldsymbol{\tau}$’s imply (see problem 12.4) that the components of $\boldsymbol{T}$ defined
by (12.45) satisfy the commutation relations 


$$[T_i, T_j] = i\epsilon_{ijk}T_k \tag{12.47}$$


for i, j, k = 1, 2, 3, which are simply the standard angular momentum commutation
relations, once more. Thus the energy levels of nuclei ought to be
characterized (after allowance for electromagnetic effects, and correcting for
the slight neutron-proton mass difference) by the eigenvalues of $\boldsymbol{T}^2$ and $T_3$,
say, which can be simultaneously diagonalized along with H. These eigenval- 
ues should then be, to a good approximation, ‘good quantum numbers’ for 
nuclei, if the assumed isospin invariance is true. 

What are the possible eigenvalues? We know that the T’s are Hermitian 
and satisfy exactly the same commutation relations (12.47) as the angular 
momentum operators. These conditions are all that are needed to show that
the eigenvalues of $\boldsymbol{T}^2$ are of the form $T(T+1)$, where $T = 0, \tfrac{1}{2}, 1, \ldots$, and that

FIGURE 12.2

Energy levels (adjusted for Coulomb energy and neutron-proton mass differ- 
ences) of nuclei of the same mass number but different charge, showing (a) 
‘mirror’ doublets, (b) triplets and (c) doublets and quartets. 


for a given T the eigenvalues of $T_3$ are $-T, -T+1, \ldots, T-1, T$; that is, there
are 2T + 1 degenerate states for a given T. These states all have the same
A value, and since $T_3$ counts $+\tfrac{1}{2}$ for every proton and $-\tfrac{1}{2}$ for every neutron,
it is clear that successive values of T3 correspond physically to changing one 
neutron into a proton or vice versa. Thus we expect to see ‘charge multiplets’ 
of levels in neighbouring nuclear isobars. These are indeed observed; figure 
12.2 shows some examples. These level schemes (which have been adjusted 
for Coulomb energy differences, and for the neutron-proton mass difference), 
provide clear evidence of $T = \tfrac{1}{2}$ (doublet), $T = 1$ (triplet) and $T = \tfrac{3}{2}$ (quartet)
multiplets. It is important to note that states in the same T-multiplet must
have the same $J^P$ quantum numbers (these are indicated on the levels for
$^{18}$F): obviously the nuclear forces will depend on the space and spin degrees of
freedom of the nucleons, and will only be the same between different nucleons 
if the space-spin part of the wavefunction is the same. 

Thus the assumed invariance of the nucleon-nucleon force produces a richer
nuclear multiplet structure, going beyond the original n-p doublet. These
higher-dimensional multiplets ($T = 1, \tfrac{3}{2}, \ldots$) are called ‘irreducible representations’
of SU(2). The commutation relations (12.47) are called the Lie algebra
of SU(2)⁴ (see appendix M), and the general group theoretical problem of understanding
all possible multiplets for SU(2) is equivalent to the problem of
finding matrices which satisfy these commutation relations. These are, in fact,
precisely the angular momentum matrices of dimension (2T + 1) x (2T + 1)
which are generalizations of the $\boldsymbol{\tau}/2$’s, which themselves correspond to $T = \tfrac{1}{2}$,
as indicated in the notation $\boldsymbol{T}^{(1/2)}$. For example, the T = 1 matrices are 3 x 3
and can be compactly summarised by (problem 12.5)

$$(T_i^{(1)})_{jk} = -i\epsilon_{ijk} \tag{12.48}$$


where the numbers $-i\epsilon_{ijk}$ are deliberately chosen to be the same numbers
(with a minus sign) that specify the algebra in (12.47); the latter are called the
structure constants of the SU(2) group (see appendix M, sections M.3-M.5). In
general there will be matrices $\boldsymbol{T}^{(T)}$ of dimensionality (2T + 1) x (2T + 1) which
satisfy (12.47), and correspondingly (2T + 1)-dimensional wavefunctions $\psi^{(T)}$
analogous to the two-dimensional ($T = \tfrac{1}{2}$) case of (12.8). The generalization
of (12.32) to these higher-dimensional multiplets is then 


$$\psi^{(T)\prime} = \exp(i\boldsymbol{\alpha}\cdot\boldsymbol{T}^{(T)})\,\psi^{(T)}, \tag{12.49}$$


which has the general form of (12.35). In this case, the matrices $\boldsymbol{T}^{(T)}$ provide
a (2T + 1)-dimensional matrix representation of the generators of SU(2). We 
shall meet field-theoretic representations of the generators in section 12.3. 
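
As an illustration of (12.48), the T = 1 matrices can be built directly from the structure constants and tested against the algebra (12.47) and the eigenvalue T(T + 1) = 2 of $\boldsymbol{T}^2$. This is a minimal sketch with the standard library only; the names `T_ONE`, `algebra_defect` and `casimir` are illustrative and do not appear in the text.

```python
def epsilon(i, j, k):
    """Totally antisymmetric symbol for indices drawn from 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2


def matmul(a, b):
    """Product of two 3 x 3 matrices stored as nested lists."""
    return [[sum(a[r][m] * b[m][c] for m in range(3)) for c in range(3)]
            for r in range(3)]


# Equation (12.48): (T_i)_{jk} = -i eps_ijk, giving three 3 x 3 matrices.
T_ONE = [[[-1j * epsilon(i, j, k) for k in range(3)] for j in range(3)]
         for i in range(3)]


def algebra_defect(gens):
    """Largest entry of [T_i, T_j] - i eps_ijk T_k over all i, j."""
    worst = 0.0
    for i in range(3):
        for j in range(3):
            ab = matmul(gens[i], gens[j])
            ba = matmul(gens[j], gens[i])
            for r in range(3):
                for c in range(3):
                    rhs = sum(1j * epsilon(i, j, k) * gens[k][r][c]
                              for k in range(3))
                    worst = max(worst, abs(ab[r][c] - ba[r][c] - rhs))
    return worst


def casimir(gens):
    """Return T^2 = T_1^2 + T_2^2 + T_3^2 as a 3 x 3 matrix."""
    squares = [matmul(g, g) for g in gens]
    return [[sum(s[r][c] for s in squares) for c in range(3)]
            for r in range(3)]


print(algebra_defect(T_ONE))  # vanishes if (12.47) holds
print(casimir(T_ONE))         # should be 2 times the unit matrix
```

The same two checks apply to any candidate set of (2T + 1) x (2T + 1) matrices: the algebra fixes the representation, and $\boldsymbol{T}^2$ then identifies which T it carries.
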

We now proceed to consider isospin in our primary area of interest, which 
is particle physics. 


12.2.3 Isospin in particle physics: flavour SU(2)$_{\rm f}$


The neutron and proton states themselves are actually only the ground states 
of a whole series of corresponding B = 1 levels with isospin $\tfrac{1}{2}$ (i.e. doublets).
Another series of baryonic levels comes in four charge states, corresponding
to $T = \tfrac{3}{2}$; and in the meson sector, the $\pi$’s appear as the lowest states of a
sequence of mesonic triplets (T = 1). Many other examples also exist, but
with one remarkable difference as compared to the nuclear physics case: no
baryon states are known with $T > \tfrac{3}{2}$, nor any meson states with T > 1.

The most natural interpretation of these facts is that the observed states 
are composites of more basic entities which carry different charges but are 
nearly degenerate in mass, while the forces between these entities are charge-independent,
just as in the nuclear (p,n) case. These entities are, of course,


⁴ Likewise, the angular momentum commutation relations (12.29) are the Lie algebra of
the rotation group SO(3). The Lie algebras of the two groups are therefore the same. For 
an indication of how, nevertheless, the groups do differ, see appendix M, section M.7. 
the quarks: the n contains (udd), the p is (uud), and the $\Delta$-quartet is (uuu,
uud, udd, ddd). The u-d isospin doublet plays the role of the p-n doublet in
the nuclear case, and this degree of freedom is what we now call SU(2) isospin
flavour symmetry at the quark level, denoted by SU(2)$_{\rm f}$. We shall denote the
u-d quark doublet wavefunction by


$$q = \begin{pmatrix}u \\ d\end{pmatrix} \tag{12.50}$$


omitting now the explicit representation label ‘(1/2)’, and shortening ‘$\psi_u$’ to
just ‘u’, and similarly for ‘d’. Then, under an SU(2)$_{\rm f}$ transformation,

$$q \to q' = Vq = \exp(i\boldsymbol{\alpha}\cdot\boldsymbol{\tau}/2)\,q. \tag{12.51}$$


The limitation $T \le \frac{3}{2}$ for baryonic states can be understood in terms of their
being composed of three $T = \frac{1}{2}$ constituents (two of them pair to $T = 1$ or
$T = 0$, and the third adds to $T = 1$ to make $T = \frac{3}{2}$ or $T = \frac{1}{2}$, and to $T = 0$ to
make $T = \frac{1}{2}$, by the usual angular momentum addition rules). It is, however,
a challenge for QCD to explain why, for example, states with four or five
quarks should not exist (nor states of one or two quarks!), and why a state
of six quarks, for example, appears as the deuteron, which is a loosely bound
state of n and p, rather than as a compact $B = 2$ analogue of the n and p
themselves.

Meson states such as the pion are formed from a quark and an antiquark,
and it is therefore appropriate at this point to explain how antiparticles are
described in isospin terms. An antiparticle is characterized by having the
signs of all its additively conserved quantum numbers reversed, relative to
those of the corresponding particle. Thus if a u-quark has $B = \frac{1}{3}$, $T = \frac{1}{2}$, $T_3 = \frac{1}{2}$,
a $\bar{u}$-quark has $B = -\frac{1}{3}$, $T = \frac{1}{2}$, $T_3 = -\frac{1}{2}$. Similarly the $\bar{d}$ has $B = -\frac{1}{3}$,
$T = \frac{1}{2}$ and $T_3 = \frac{1}{2}$. Note that, while $T_3$ is an additively conserved
quantum number, the magnitude of the isospin is not additively conserved:
rather, it is ‘vectorially’ conserved according to the rules of combining
angular-momentum-like quantum numbers, as we have seen. Thus the antiquarks $\bar{d}$
and $\bar{u}$ form the $T_3 = +\frac{1}{2}$ and $T_3 = -\frac{1}{2}$ members of an SU(2)$_{\rm f}$ doublet, just as
u and d themselves do, and the question arises: given that the (u, d) doublet
transforms as in (12.51), how does the ($\bar{u}$, $\bar{d}$) doublet transform?

The answer is that antiparticles are assigned to the complex conjugate of
the representation to which the corresponding particles belong. Thus,
identifying $\bar{u} \equiv u^*$ and $\bar{d} \equiv d^*$, we have^5

$$ \bar{q}\,' = V^* \bar{q}, \quad \text{or} \quad \begin{pmatrix} \bar{u} \\ \bar{d} \end{pmatrix}' = \exp(-i\boldsymbol{\alpha}\cdot\boldsymbol{\tau}^*/2) \begin{pmatrix} \bar{u} \\ \bar{d} \end{pmatrix} \tag{12.52} $$

for the SU(2)$_{\rm f}$ transformation law of the antiquark doublet. In mathematical
terms, this means (compare (12.32)) that the three matrices $-\frac{1}{2}\boldsymbol{\tau}^*$ must
represent the generators of SU(2)$_{\rm f}$ in the $\mathbf{2}^*$ representation (i.e. the complex
conjugate of the original two-dimensional representation, which we will now
call $\mathbf{2}$). Referring to (12.25), we see that $\tau_1^* = \tau_1$, $\tau_2^* = -\tau_2$ and $\tau_3^* = \tau_3$. It is
then easy to check that the three matrices $-\tau_1/2$, $+\tau_2/2$ and $-\tau_3/2$ do indeed
satisfy the required commutation relations (12.28), and thus provide a valid
matrix representation of the SU(2) generators. Also, since the third
component of isospin is here represented by $-\tau_3^*/2 = -\tau_3/2$, the desired reversal in
sign of the additively conserved eigenvalue does occur.

^5 The overbar ($\bar{u}$ etc.) here stands only for ‘antiparticle’, and has nothing to do with the
Dirac conjugate $\bar{\psi}$ introduced in section 4.4.
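
The check that $-\boldsymbol{\tau}^*/2$ satisfies the SU(2) commutation relations can also be done numerically. The following is a minimal sketch, assuming NumPy is available; the helper names `levi_civita` and `satisfies_su2_algebra` are illustrative and not part of the text.

```python
import numpy as np

# Pauli matrices tau_1, tau_2, tau_3 (standard form, as in (12.25))
tau = np.array([[[0, 1], [1, 0]],
                [[0, -1j], [1j, 0]],
                [[1, 0], [0, -1]]], dtype=complex)

def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) / 2

def satisfies_su2_algebra(gens, tol=1e-12):
    """Return True if [T_i, T_j] = i eps_ijk T_k for the three matrices."""
    for i in range(3):
        for j in range(3):
            comm = gens[i] @ gens[j] - gens[j] @ gens[i]
            expected = sum(1j * levi_civita(i, j, k) * gens[k]
                           for k in range(3))
            if not np.allclose(comm, expected, atol=tol):
                return False
    return True

gens_2 = tau / 2                  # generators in the 2
gens_2star = -np.conj(tau) / 2    # generators in the 2*
wrong_sign = np.conj(tau) / 2     # +tau*/2: hypothetical counter-example

print(satisfies_su2_algebra(gens_2))       # algebra holds for tau/2
print(satisfies_su2_algebra(gens_2star))   # and for -tau*/2
print(satisfies_su2_algebra(wrong_sign))   # but not for +tau*/2
print(np.diag(gens_2star[2]).real)         # T_3 eigenvalues, sign-reversed
```

The third check illustrates why the minus sign matters: complex conjugation alone reverses the sign of $\tau_2$ only, which spoils the algebra unless all three generators are also multiplied by $-1$.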

Although the quark doublet (u, d) and antiquark doublet ($\bar{u}$, $\bar{d}$) do
transform differently under SU(2)$_{\rm f}$ transformations, there is nevertheless a sense
in which the $\mathbf{2}^*$ and $\mathbf{2}$ representations are somehow the ‘same’: after all, the
quantum numbers $T = \frac{1}{2}$, $T_3 = \pm\frac{1}{2}$ describe them both. In fact, the two
representations are ‘unitarily equivalent’, in that we can find a unitary matrix
$U_C$ such that

$$ U_C \exp(-i\boldsymbol{\alpha}\cdot\boldsymbol{\tau}^*/2)\, U_C^{-1} = \exp(i\boldsymbol{\alpha}\cdot\boldsymbol{\tau}/2). \tag{12.53} $$

This requirement is easier to disentangle if we consider infinitesimal
transformations, for which (12.53) becomes

$$ U_C(-\boldsymbol{\tau}^*)U_C^{-1} = \boldsymbol{\tau}, \tag{12.54} $$

or

$$ U_C \tau_1 U_C^{-1} = -\tau_1, \quad U_C \tau_2 U_C^{-1} = \tau_2, \quad U_C \tau_3 U_C^{-1} = -\tau_3. \tag{12.55} $$

Bearing the commutation relations (12.28) in mind, and the fact that $\tau_i^{-1} = \tau_i$,
it is clear that we can choose $U_C$ proportional to $\tau_2$, and set

$$ U_C = i\tau_2 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \tag{12.56} $$

to obtain a convenient unitary form. From (12.52) and (12.53) we obtain
$(U_C \bar{q}\,') = V(U_C \bar{q})$, which implies that the doublet

$$ U_C \begin{pmatrix} \bar{u} \\ \bar{d} \end{pmatrix} = \begin{pmatrix} \bar{d} \\ -\bar{u} \end{pmatrix} \tag{12.57} $$

transforms in exactly the same way as (u, d). This result is useful, because
it means that we can use the familiar tables of (Clebsch-Gordan) angular
momentum coupling coefficients for combining quark and antiquark states
together, provided we include the relative minus sign between the $\bar{d}$ and $\bar{u}$
components which has appeared in (12.57). Note that, as expected, the $\bar{d}$ is
in the $T_3 = +\frac{1}{2}$ position, and the $\bar{u}$ is in the $T_3 = -\frac{1}{2}$ position.
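
The unitary equivalence (12.53) can be checked for a finite transformation as well. The sketch below assumes NumPy and uses the standard closed form $\exp(i\boldsymbol{\alpha}\cdot\boldsymbol{\tau}/2) = \cos(\alpha/2) + i\,\hat{\boldsymbol{\alpha}}\cdot\boldsymbol{\tau}\,\sin(\alpha/2)$; the function name `su2_matrix` and the particular numbers chosen for `alpha` and `qbar` are arbitrary illustrations.

```python
import numpy as np

tau = np.array([[[0, 1], [1, 0]],
                [[0, -1j], [1j, 0]],
                [[1, 0], [0, -1]]], dtype=complex)

def su2_matrix(alpha):
    """V = exp(i alpha.tau/2), via cos(a/2) + i (n.tau) sin(a/2)."""
    alpha = np.asarray(alpha, dtype=float)
    a = np.linalg.norm(alpha)
    if a == 0:
        return np.eye(2, dtype=complex)
    n_dot_tau = np.einsum('i,ijk->jk', alpha / a, tau)
    return np.cos(a / 2) * np.eye(2) + 1j * np.sin(a / 2) * n_dot_tau

U_C = 1j * tau[1]                  # i tau_2 = [[0, 1], [-1, 0]], as in (12.56)
alpha = np.array([0.3, -1.1, 0.7])  # arbitrary finite parameters
V = su2_matrix(alpha)               # acts on the quark doublet (u, d)
V_star = np.conj(V)                 # exp(-i alpha.tau*/2), acts on (ubar, dbar)

# Left-hand side of (12.53)
lhs = U_C @ V_star @ np.linalg.inv(U_C)

# The rotated antiquark doublet U_C qbar = (dbar, -ubar) transforms with V
qbar = np.array([0.2 + 0.1j, -0.5 + 0.4j])   # arbitrary (ubar, dbar) amplitudes
rotated_after = U_C @ (V_star @ qbar)
rotated_before = V @ (U_C @ qbar)
print(np.allclose(lhs, V), np.allclose(rotated_after, rotated_before))
```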

As an application of these results, let us compare the $T = 0$ combination
of the p and n states to form the (isoscalar) deuteron, and the combination
of (u, d) and ($\bar{u}$, $\bar{d}$) states to form the isoscalar $\omega$-meson. In the first, the
isospin part of the wavefunction is $\frac{1}{\sqrt{2}}(\psi_p\psi_n - \psi_n\psi_p)$, corresponding to the
$S = 0$ combination of two spin-$\frac{1}{2}$ particles in quantum mechanics given by
$\frac{1}{\sqrt{2}}(|\uparrow\rangle|\downarrow\rangle - |\downarrow\rangle|\uparrow\rangle)$. But in the second case the corresponding wavefunction is
$\frac{1}{\sqrt{2}}(\bar{d}d - (-\bar{u})u) = \frac{1}{\sqrt{2}}(\bar{d}d + \bar{u}u)$. Similarly, the $T = 1$, $T_3 = 0$ state describing
the $\pi^0$ is $\frac{1}{\sqrt{2}}(\bar{d}d + (-\bar{u})u) = \frac{1}{\sqrt{2}}(\bar{d}d - \bar{u}u)$.

There is a very convenient alternative way of obtaining these wavefunctions,
which we include here because it generalizes straightforwardly to SU(3);
its advantage is that it avoids the use of the explicit C-G coupling coefficients,
and of their (more complicated) analogues in SU(3).

Bearing in mind the identifications $\bar{u} \equiv u^*$, $\bar{d} \equiv d^*$, we see that the
$T = 0$ $\bar{q}q$ combination $\bar{u}u + \bar{d}d$ can be written as $u^*u + d^*d$, which is just $q^\dagger q$
(recall that $\dagger$ means transpose and complex conjugate). Under an SU(2)$_{\rm f}$
transformation, $q \to q' = Vq$, so $q^\dagger \to q'^\dagger = q^\dagger V^\dagger$ and

$$ q^\dagger q \to q'^\dagger q' = q^\dagger V^\dagger V q = q^\dagger q \tag{12.58} $$

using $V^\dagger V = 1_2$; thus $q^\dagger q$ is indeed an SU(2)$_{\rm f}$ invariant, which means it has
$T = 0$ (no multiplet partners).


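We may also construct the $T = 1$ $\bar{q}q$ states in a similar way. Consider
the three quantities $v_i$ defined by

$$ v_i = q^\dagger \tau_i q, \qquad i = 1, 2, 3. \tag{12.59} $$

Under an infinitesimal SU(2)$_{\rm f}$ transformation

$$ q' = (1_2 + i\boldsymbol{\epsilon}\cdot\boldsymbol{\tau}/2)\, q, \tag{12.60} $$

the three quantities $v_i$ transform to

$$ v_i' = q^\dagger (1_2 - i\boldsymbol{\epsilon}\cdot\boldsymbol{\tau}/2)\, \tau_i\, (1_2 + i\boldsymbol{\epsilon}\cdot\boldsymbol{\tau}/2)\, q, \tag{12.61} $$

where we have used $q'^\dagger = q^\dagger (1_2 + i\boldsymbol{\epsilon}\cdot\boldsymbol{\tau}/2)^\dagger$ and then $\boldsymbol{\tau}^\dagger = \boldsymbol{\tau}$. Retaining only
the first-order terms in $\boldsymbol{\epsilon}$ gives (problem 12.6)

$$ v_i' = v_i + i\frac{\epsilon_j}{2}\, q^\dagger (\tau_i\tau_j - \tau_j\tau_i)\, q \tag{12.62} $$

where the sum on $j = 1, 2, 3$ is understood. But from (12.28) we know the
commutator of two $\tau$'s, so that (12.62) becomes

$$ \begin{aligned} v_i' &= v_i + i\frac{\epsilon_j}{2}\, q^\dagger\, 2i\epsilon_{ijk}\tau_k\, q \qquad (\text{sum on } k = 1, 2, 3) \\ &= v_i - \epsilon_{ijk}\epsilon_j\, q^\dagger \tau_k q \\ &= v_i - \epsilon_{ijk}\epsilon_j v_k, \end{aligned} \tag{12.63} $$

which may also be written in ‘vector’ notation as

$$ \boldsymbol{v}' = \boldsymbol{v} - \boldsymbol{\epsilon} \times \boldsymbol{v}. \tag{12.64} $$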
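
The transformation law for the bilinears $v_i = q^\dagger\tau_i q$ lends itself to a quick numerical check. The sketch below, assuming NumPy, applies the infinitesimal transformation (12.60) to an arbitrary isospinor and compares the change in $\boldsymbol{v}$ with $-\boldsymbol{\epsilon}\times\boldsymbol{v}$; the numerical values of `q` and `eps` are arbitrary choices, and agreement is expected only up to terms of second order in `eps`.

```python
import numpy as np

tau = np.array([[[0, 1], [1, 0]],
                [[0, -1j], [1j, 0]],
                [[1, 0], [0, -1]]], dtype=complex)

def bilinears(q):
    """Return the real 3-vector v_i = q^dagger tau_i q."""
    return np.array([np.vdot(q, t @ q) for t in tau]).real

q = np.array([0.3 + 0.5j, -0.8 + 0.1j])      # arbitrary (u, d) amplitudes
eps = 1e-6 * np.array([0.4, -0.9, 0.5])      # small transformation parameters

# Infinitesimal transformation q' = (1 + i eps.tau/2) q, as in (12.60)
q_new = (np.eye(2) + 0.5j * np.einsum('i,ijk->jk', eps, tau)) @ q

v = bilinears(q)
v_new = bilinears(q_new)

# Compare the first-order change with the vector law v' = v - eps x v
print(v_new - v)
print(-np.cross(eps, v))
```

The invariant $q^\dagger q$ of (12.58) equals the length of $\boldsymbol{v}$ for a two-component $q$, so the rotation-like character of the transformation is consistent with $T = 0$ for $q^\dagger q$ and $T = 1$ for $\boldsymbol{v}$.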


Equation (12.63) states that, under an (infinitesimal) SU(2)$_{\rm f}$ transformation,
the three quantities $v_i$ ($i = 1, 2, 3$) transform into specific linear
combinations of themselves, as determined by the coefficients $\epsilon_{ijk}$ (the $\epsilon$'s are just
the parameters of the infinitesimal transformation). This is precisely what is
needed for a set of quantities to form the basis for a representation. In this
case, it is the $T = 1$ representation as we can guess from the multiplicity of
three, but we can also directly verify it, as follows. Equation (12.49) with
$T = 1$, together with (12.48), tell us how a $T = 1$ triplet should transform:
namely, under an infinitesimal transformation (with $1_3$ the unit $3 \times 3$ matrix),

$$ \begin{aligned} \psi_i^{(1)\prime} &= (1_3 + i\boldsymbol{\epsilon}\cdot\mathbf{T}^{(1)})_{ik}\, \psi_k^{(1)} && (\text{sum on } k = 1, 2, 3) \\ &= (1_3 + i\epsilon_j T_j^{(1)})_{ik}\, \psi_k^{(1)} && (\text{sum on } j = 1, 2, 3) \\ &= (\delta_{ik} + i\epsilon_j (T_j^{(1)})_{ik})\, \psi_k^{(1)} \\ &= (\delta_{ik} + i\epsilon_j \cdot (-i\epsilon_{jik}))\, \psi_k^{(1)} && \text{using (12.48)} \\ &= \psi_i^{(1)} - \epsilon_{ijk}\epsilon_j \psi_k^{(1)} && \text{using the antisymmetry of } \epsilon_{ijk} \end{aligned} \tag{12.65} $$

which is exactly the same as (12.63).

The reader who has worked through problem 4.2(a) will recognize the
exact analogy between the $T = 1$ transformation law (12.64) for the isospin
bilinear $q^\dagger\boldsymbol{\tau}q$, and the 3-vector transformation law (cf (4.9)) for the Pauli
spinor bilinear $\phi^\dagger\boldsymbol{\sigma}\phi$.

Returning to the physics of $v_i$, inserting (12.50) into (12.59) we find
explicitly

$$ v_1 = \bar{u}d + \bar{d}u, \quad v_2 = -i\bar{u}d + i\bar{d}u, \quad v_3 = \bar{u}u - \bar{d}d. \tag{12.66} $$

Apart from the normalization factor of $\frac{1}{\sqrt{2}}$, $v_3$ may therefore be identified with
the $T_3 = 0$ member of the $T = 1$ triplet, having the quantum numbers of the
$\pi^0$. Neither $v_1$ nor $v_2$ has a definite value of $T_3$, however: rather, we need to
consider the linear combinations

$$ \tfrac{1}{2}(v_1 + iv_2) = \bar{u}d, \qquad T_3 = -1 \tag{12.67} $$

and

$$ \tfrac{1}{2}(v_1 - iv_2) = \bar{d}u, \qquad T_3 = +1 \tag{12.68} $$

which have the quantum numbers of the $\pi^-$ and $\pi^+$. The use of $v_1 \pm iv_2$
here is precisely analogous to the use of the ‘spherical basis’ wavefunctions
$x \pm iy = r\sin\theta\, e^{\pm i\phi}$ for $l = 1$ states in quantum mechanics, rather than the
‘Cartesian’ ones $x$ and $y$.

We are now ready to proceed to SU(3). 




12.3 Flavour SU(3)$_{\rm f}$


Larger hadronic multiplets also exist, in which strange particles are grouped
with non-strange ones. Gell-Mann (1961) and Ne’eman (1961) (see also
Gell-Mann and Ne’eman 1964) were the first to propose SU(3)$_{\rm f}$ as the correct
generalization of isospin SU(2)$_{\rm f}$ to include strangeness. Like SU(2), SU(3)
is a group whose elements are matrices: in this case, unitary $3 \times 3$ ones,
of unit determinant. The general group-theoretic analysis of SU(3) is quite
complicated, but is fortunately not necessary for the physical applications we
require. We can, in fact, develop all the results needed by mimicking the steps
followed for SU(2).

We start by finding the general form of an SU(3) matrix. Such matrices
obviously act on 3-component column vectors, the generalization of the
2-component isospinors of SU(2). In more physical terms, we regard the three
quark wavefunctions u, d and s as being approximately degenerate, and we
consider unitary $3 \times 3$ transformations among them via

$$ q' = Wq \tag{12.69} $$

where $q$ now stands for the 3-component column vector

$$ q = \begin{pmatrix} u \\ d \\ s \end{pmatrix} \tag{12.70} $$

and $W$ is a $3 \times 3$ unitary matrix of determinant 1 (again, an overall phase
has been extracted). The representation provided by this triplet of states
is called the ‘fundamental’ representation of SU(3)$_{\rm f}$ (just as the isospinor
representation is the fundamental one of SU(2)$_{\rm f}$).

To determine the general form of an SU(3) matrix $W$, we follow exactly
the same steps as in the SU(2) case. An infinitesimal SU(3) matrix has the
form

$$ W_{\rm infl} = 1_3 + i\chi \tag{12.71} $$

where $\chi$ is a $3 \times 3$ traceless Hermitian matrix. Such a matrix involves eight
independent parameters (problem 12.7) and can be written as

$$ \chi = \boldsymbol{\eta}\cdot\boldsymbol{\lambda}/2 \tag{12.72} $$

where $\boldsymbol{\eta} = (\eta_1, \ldots, \eta_8)$ and the $\lambda$'s are eight matrices generalizing the $\tau$
matrices of (12.25). They are the generators of SU(3) in the three-dimensional
fundamental representation, and their commutation relations define the
algebra of SU(3) (compare (12.28) for SU(2)):

$$ [\lambda_a/2, \lambda_b/2] = if_{abc}\lambda_c/2, \tag{12.73} $$

where $a$, $b$ and $c$ run from 1 to 8.

The $\lambda$-matrices (often called the Gell-Mann matrices) are given in
appendix M, along with the SU(3) structure constants $if_{abc}$; the constants $f_{abc}$
are all real.

A finite SU(3) transformation on the quark triplet is then (cf (12.32))

$$ q' = \exp(i\boldsymbol{\alpha}\cdot\boldsymbol{\lambda}/2)\, q, \tag{12.74} $$

which also has the ‘generalized phase transformation’ character of (12.35), now
with eight ‘phase angles’. Thus $W$ is parametrized as $W = \exp(i\boldsymbol{\alpha}\cdot\boldsymbol{\lambda}/2)$.

As in the case of SU(2)$_{\rm f}$, exact symmetry under SU(3)$_{\rm f}$ would imply that
the three states u, d and s were degenerate in mass. Actually, of course, this
is not the case: in particular, while the u and d quark masses are of order 1-5
MeV, the s quark mass is greater, of order 100 MeV. Nevertheless it is still
possible to regard this as relatively small on a typical hadronic mass scale, so
we may proceed to explore the physical consequences of this (approximate)
SU(3)$_{\rm f}$ flavour symmetry.

Such a symmetry implies that the eigenvalues of the $\lambda$'s are constants
of the motion, but because of the commutation relations (12.73) not all of
these operators have simultaneous eigenstates. This happened for SU(2) too,
but there the very close analogy with SO(3) told us how the states were
to be correctly classified, by the eigenvalues of the relevant complete set of
mutually commuting operators. Here it is more involved - for a start, there
are 8 matrices $\lambda_a$. A glance at appendix M, section M.4.5, shows that two of
the $\lambda$'s are diagonal (in the chosen representation), namely $\lambda_3$ and $\lambda_8$. This
means physically that for SU(3) there are two additively conserved quantum
numbers, which in this case are of course the third component of hadronic
isospin (since $\lambda_3$ is simply $\tau_3$ bordered by zeros), and a quantity related to
strangeness. Defining the hadronic hypercharge $Y$ by $Y = B + S$, where $B$ is
the baryon number ($\frac{1}{3}$ for each quark) and the strangeness values are $S(u) =
S(d) = 0$, $S(s) = -1$, we find that the physically required eigenvalues imply
that the matrix representing the hypercharge operator is $Y^{(3)} = \frac{1}{\sqrt{3}}\lambda_8$, in this
fundamental (three-dimensional) representation, denoted by the symbol $\mathbf{3}$.
Identifying $T_3^{(3)} = \frac{1}{2}\lambda_3$ then gives the Gell-Mann-Nishijima relation
$Q = T_3 + Y/2$ for the quark charges in units of $|e|$.

So $\lambda_3$ and $\lambda_8$ are analogous to $\tau_3$; what about the analogue of $\boldsymbol{\tau}^2$, which
is diagonalizable simultaneously with $\tau_3$ in the case of SU(2)? Indeed, (cf
(12.41)) $\boldsymbol{\tau}^2$ is a multiple of the $2 \times 2$ unit matrix. In just the same way one
finds that $\boldsymbol{\lambda}^2$ is also proportional to the unit matrix:

$$ (\boldsymbol{\lambda}/2)^2 \equiv \sum_{a=1}^{8} (\lambda_a/2)^2 = \frac{4}{3}\, 1_3, \tag{12.75} $$

as can be verified from the explicit forms of the $\lambda$-matrices given in appendix
M, section M.4.5. Thus we may characterize the ‘fundamental triplet’ (12.70)
by the eigenvalues of $(\boldsymbol{\lambda}/2)^2$, $\lambda_3$ and $\lambda_8$. The conventional way of representing
this pictorially is to plot the states in a $Y$-$T_3$ diagram, as shown in figure
12.3.
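
The quantum numbers of the fundamental triplet can be read off numerically from the explicit $\lambda$-matrices. The following is a sketch assuming NumPy and the standard Gell-Mann matrices (the text refers to appendix M for their form, which is not reproduced here); the function name `gell_mann` is illustrative.

```python
import numpy as np

def gell_mann():
    """Return the eight standard Gell-Mann matrices as an (8, 3, 3) array."""
    lam = np.zeros((8, 3, 3), dtype=complex)
    lam[0][0, 1] = lam[0][1, 0] = 1
    lam[1][0, 1], lam[1][1, 0] = -1j, 1j
    lam[2][0, 0], lam[2][1, 1] = 1, -1
    lam[3][0, 2] = lam[3][2, 0] = 1
    lam[4][0, 2], lam[4][2, 0] = -1j, 1j
    lam[5][1, 2] = lam[5][2, 1] = 1
    lam[6][1, 2], lam[6][2, 1] = -1j, 1j
    lam[7] = np.diag([1, 1, -2]) / np.sqrt(3)
    return lam

lam = gell_mann()

T3 = lam[2] / 2              # third component of isospin
Y = lam[7] / np.sqrt(3)      # hypercharge, Y = B + S
Q = T3 + Y / 2               # Gell-Mann-Nishijima relation

# Quadratic combination of (12.75): sum over a of (lambda_a / 2)^2
casimir = sum(l @ l for l in lam) / 4

for name, i in (("u", 0), ("d", 1), ("s", 2)):
    print(name, T3[i, i].real, Y[i, i].real, Q[i, i].real)
print(np.round(casimir.real, 12))
```

The printed rows give ($T_3$, $Y$, $Q$) for u, d and s, and the final matrix should be $\frac{4}{3}$ times the unit matrix.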

FIGURE 12.3
The $Y$-$T_3$ quantum numbers of the fundamental triplet $\mathbf{3}$ of quarks, and of
the antitriplet $\mathbf{3}^*$ of antiquarks.

We may now consider other representations of SU(3)$_{\rm f}$. The first
important one is that to which the antiquarks belong. If we denote the fundamental
three-dimensional representation accommodating the quarks by $\mathbf{3}$, then the
antiquarks have quantum numbers appropriate to the ‘complex conjugate’ of
this representation, denoted by $\mathbf{3}^*$ just as in the SU(2) case. The $\bar{q}$
wavefunctions, identified as $\bar{u} \equiv u^*$, $\bar{d} \equiv d^*$ and $\bar{s} \equiv s^*$, then transform by

$$ \bar{q}\,' = \begin{pmatrix} \bar{u} \\ \bar{d} \\ \bar{s} \end{pmatrix}' = W^* \bar{q} = \exp(-i\boldsymbol{\alpha}\cdot\boldsymbol{\lambda}^*/2)\, \bar{q} \tag{12.76} $$

instead of by (12.74). As for the $\mathbf{2}^*$ representation of SU(2), (12.76) means
that the eight quantities $-\boldsymbol{\lambda}^*/2$ represent the SU(3) generators in this $\mathbf{3}^*$
representation. Referring to appendix M, section M.4.5, one quickly sees
that $\lambda_3$ and $\lambda_8$ are real, so that the eigenvalues of the physical observables
$T_3^{(3^*)} = -\lambda_3/2$ and $Y^{(3^*)} = -\frac{1}{\sqrt{3}}\lambda_8$ (in this representation) are reversed
relative to those in the $\mathbf{3}$, as expected for antiparticles. The $\bar{u}$, $\bar{d}$ and $\bar{s}$ states
may also be plotted on the $Y$-$T_3$ diagram, figure 12.3, as shown.

Here is already one important difference between SU(3) and SU(2): the
fundamental SU(3) representation $\mathbf{3}$ and its complex conjugate $\mathbf{3}^*$ are not
equivalent. This follows immediately from figure 12.3, where it is clear that
the extra quantum number $Y$ distinguishes the two representations.
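
The inequivalence can be seen at the level of eigenvalues. If two representations are related by a similarity transformation, each generator must have the same spectrum in both; this is a necessary condition only, not a sufficient one. The sketch below, assuming NumPy and the standard diagonal forms of $\tau_3$ and $\lambda_8$, compares the spectra; the helper name `spectrum` is illustrative.

```python
import numpy as np

def spectrum(m):
    """Sorted eigenvalues of a Hermitian matrix."""
    return np.sort(np.linalg.eigvalsh(m))

tau3 = np.diag([1.0, -1.0])
lam8 = np.diag([1.0, 1.0, -2.0]) / np.sqrt(3)

# SU(2): the generator tau_3/2 in the 2 versus -tau_3*/2 in the 2*
same_su2 = np.allclose(spectrum(tau3 / 2), spectrum(-np.conj(tau3) / 2))

# SU(3): the generator lambda_8/2 in the 3 versus -lambda_8*/2 in the 3*
same_su3 = np.allclose(spectrum(lam8 / 2), spectrum(-np.conj(lam8) / 2))

print(same_su2, same_su3)
```

For SU(2) the spectrum $\{-\frac{1}{2}, +\frac{1}{2}\}$ is symmetric under a change of sign, so no obstruction arises. For SU(3) the eigenvalues of $\lambda_8$ are proportional to $(1, 1, -2)$, which is not symmetric under sign reversal, so no matrix analogous to $U_C$ can exist.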

Larger SU(3)$_{\rm f}$ representations can be created by combining quarks and
antiquarks, as in SU(2)$_{\rm f}$. For our present purposes, an important one is the
eight-dimensional (‘octet’) representation which appears when one combines
the $\mathbf{3}^*$ and $\mathbf{3}$ representations, in a way which is very analogous to the
three-dimensional (‘triplet’) representation obtained by combining the $\mathbf{2}^*$ and $\mathbf{2}$
representations of SU(2).

Consider first the quantity $\bar{u}u + \bar{d}d + \bar{s}s$. As in the SU(2) case, this can
be written equivalently as $q^\dagger q$, which is invariant under $q \to q' = Wq$ since
$W^\dagger W = 1_3$. So this combination is an SU(3) singlet. The octet coupling is
formed by a straightforward generalization of the SU(2) triplet coupling $q^\dagger\boldsymbol{\tau}q$
of (12.59),

$$ w_a = q^\dagger \lambda_a q, \qquad a = 1, 2, \ldots 8. \tag{12.77} $$


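Under an infinitesimal SU(3)$_{\rm f}$ transformation (compare (12.61) and (12.62)),

$$ \begin{aligned} w_a \to w_a' &= q^\dagger (1_3 - i\boldsymbol{\eta}\cdot\boldsymbol{\lambda}/2)\, \lambda_a\, (1_3 + i\boldsymbol{\eta}\cdot\boldsymbol{\lambda}/2)\, q \\ &\approx q^\dagger \lambda_a q + i\frac{\eta_b}{2}\, q^\dagger (\lambda_a\lambda_b - \lambda_b\lambda_a)\, q \end{aligned} \tag{12.78} $$

where the sum on $b = 1$ to 8 is understood. Using (12.73) for the commutator
of two $\lambda$'s we find

$$ w_a' = w_a + i\frac{\eta_b}{2}\, q^\dagger\, 2i f_{abc}\lambda_c\, q \tag{12.79} $$

or

$$ w_a' = w_a - f_{abc}\eta_b w_c \tag{12.80} $$

which may usefully be compared with (12.63). Just as in the SU(2)$_{\rm f}$ triplet
case, equation (12.80) shows that, under an SU(3)$_{\rm f}$ transformation, the eight
quantities $w_a$ ($a = 1, 2, \ldots 8$) transform into specific linear combinations of
themselves, as determined by the coefficients $f_{abc}$ (the $\eta$'s are just the
parameters of the infinitesimal transformation).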

This is, again, precisely what is needed for a set of quantities to form the
basis for a representation: in this case, an eight-dimensional representation
of SU(3)$_{\rm f}$. For a finite SU(3)$_{\rm f}$ transformation, we can ‘exponentiate’ (12.80)
to obtain

$$ \boldsymbol{w}' = \exp(i\boldsymbol{\alpha}\cdot\mathbf{G}^{(8)})\, \boldsymbol{w} \tag{12.81} $$

where $\boldsymbol{w}$ is an 8-component column vector

$$ \boldsymbol{w} = \begin{pmatrix} w_1 \\ w_2 \\ \vdots \\ w_8 \end{pmatrix} \tag{12.82} $$

such that $w_a = q^\dagger\lambda_a q$, and where (cf (12.49) for SU(2)$_{\rm f}$) the quantities
$\mathbf{G}^{(8)} = (G_1^{(8)}, G_2^{(8)}, \ldots, G_8^{(8)})$ are $8 \times 8$ matrices, acting on the 8-component vector $\boldsymbol{w}$,
and forming an 8-dimensional representation of the algebra of SU(3): that is
to say, the $G^{(8)}$'s satisfy (cf (12.73))

$$ [G_a^{(8)}, G_b^{(8)}] = if_{abc}\, G_c^{(8)}. \tag{12.83} $$

FIGURE 12.4
The $Y$-$T_3$ quantum numbers of the pseudoscalar meson octet.

The actual form of the $G_a^{(8)}$ matrices is given by comparing the infinitesimal
version of (12.81) with (12.80):

$$ (G_a^{(8)})_{bc} = -if_{abc}, \tag{12.84} $$

as may be checked in problem 12.8, where it is also verified that the matrices
specified by (12.84) do obey the commutation relations (12.83).
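
The construction of the $8 \times 8$ matrices from the structure constants can be carried out explicitly. The sketch below assumes NumPy and the standard Gell-Mann matrices (normalized so that ${\rm Tr}(\lambda_a\lambda_b) = 2\delta_{ab}$), extracts $f_{abc}$ from $f_{abc} = {\rm Tr}([\lambda_a, \lambda_b]\lambda_c)/4i$, builds $(G_a)_{bc} = -if_{abc}$ and tests the commutation relations; the names `gell_mann`, `f` and `G` are illustrative, and indices run from 0 to 7 in the code.

```python
import numpy as np

def gell_mann():
    """Return the eight standard Gell-Mann matrices as an (8, 3, 3) array."""
    lam = np.zeros((8, 3, 3), dtype=complex)
    lam[0][0, 1] = lam[0][1, 0] = 1
    lam[1][0, 1], lam[1][1, 0] = -1j, 1j
    lam[2][0, 0], lam[2][1, 1] = 1, -1
    lam[3][0, 2] = lam[3][2, 0] = 1
    lam[4][0, 2], lam[4][2, 0] = -1j, 1j
    lam[5][1, 2] = lam[5][2, 1] = 1
    lam[6][1, 2], lam[6][2, 1] = -1j, 1j
    lam[7] = np.diag([1, 1, -2]) / np.sqrt(3)
    return lam

lam = gell_mann()

# Structure constants from [lam_a, lam_b] = 2i f_abc lam_c
f = np.zeros((8, 8, 8))
for a in range(8):
    for b in range(8):
        comm = lam[a] @ lam[b] - lam[b] @ lam[a]
        for c in range(8):
            f[a, b, c] = (np.trace(comm @ lam[c]) / 4j).real

# Adjoint (octet) matrices: G[a][b, c] = -i f_abc
G = -1j * f

# Check [G_a, G_b] = i f_abc G_c for every pair (a, b)
algebra_ok = all(
    np.allclose(G[a] @ G[b] - G[b] @ G[a],
                1j * np.einsum('c,cij->ij', f[a, b], G))
    for a in range(8) for b in range(8)
)
print(algebra_ok, f[0, 1, 2], f[3, 4, 7])
```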

As in the SU(2)$_{\rm f}$ case, the 8 states generated by the combinations $q^\dagger\lambda_a q$
are not necessarily the ones with the physically desired quantum numbers. To
get the $\pi^\pm$, for example, we again need to form $(w_1 \mp iw_2)/2$. Similarly, $w_4$
produces $\bar{u}s + \bar{s}u$ and $w_5$ the combination $-i\bar{u}s + i\bar{s}u$, so the $K^\pm$ states are
$w_4 \mp iw_5$. Similarly the $K^0$, $\bar{K}^0$ states are $w_6 - iw_7$ and $w_6 + iw_7$, while the
$\eta$ (in this simple model) would be $w_8 \sim (\bar{u}u + \bar{d}d - 2\bar{s}s)$, which is orthogonal
to both the $\pi^0$ state and the SU(3)$_{\rm f}$ singlet. In this way all the pseudoscalar
octet of $\pi$-partners has been identified, as shown on the $Y$-$T_3$ diagram of
figure 12.4. We say ‘octet of $\pi$-partners’, but a reader knowing the masses
of these particles might well query why we should feel justified in regarding
them as (even approximately) degenerate. By contrast, a similar octet of
vector ($J^P = 1^-$) mesons (the $\omega$, $\rho$, $K^*$ and $\bar{K}^*$) are all much closer in mass,
averaging around 800 MeV; in these states the $\bar{q}q$ spins add to $S = 1$, while
the orbital angular momentum is still zero. The pion, and to a much lesser
extent the kaons, seem to be ‘anomalously light’ for some reason: we shall
learn the likely explanation for this in chapter 15.

There is a deep similarity between (12.84) and (12.48). In both cases, a
representation has been found in which the matrix element of a generator is
minus the corresponding structure constant. Such a representation is always
possible for a Lie group, and is called the adjoint, or regular, representation
(see appendix M, section M.5). These representations are of particular
importance in gauge theories, as we will see, since gauge quanta always belong
to the adjoint representation of the gauged group (for example, the 8 gluons
in SU(3)$_{\rm c}$).

Further flavours c, b and t of course exist, but the mass differences are now
so large that it is generally not useful to think about higher flavour groups
such as SU(4)$_{\rm f}$ etc. Instead, we now move on to consider the field-theoretic
formulation of global SU(2)$_{\rm f}$ and SU(3)$_{\rm f}$.




12.4 Non-Abelian global symmetries in Lagrangian 
quantum field theory 


12.4.1 SU(2)$_{\rm f}$ and SU(3)$_{\rm f}$


As may already have begun to be apparent in chapter 7, Lagrangian quantum
field theory is a formalism which is especially well adapted for the description
of symmetries. Without going into any elaborate general theory, we shall now
give a few examples showing how global flavour symmetry is very easily built
into a Lagrangian, generalizing in a simple way the global U(1) symmetries
considered in section 7.1 and section 7.2. This will also prepare the way for
the (local) gauge case, to be considered in the following chapter.

Consider, for example, the Lagrangian

$$ \hat{\mathcal{L}} = \bar{\hat{u}}(i\not{\partial} - m)\hat{u} + \bar{\hat{d}}(i\not{\partial} - m)\hat{d} \tag{12.85} $$

describing two free fermions ‘u’ and ‘d’ of equal mass $m$, with the overbar
now meaning the Dirac conjugate for the four-component spinor fields. Note
carefully that we are suppressing the space-time arguments of the quantum
fields $\hat{u}(x)$, $\hat{d}(x)$. As in (12.50), we are using the convenient shorthand $\hat{\psi}_u = \hat{u}$
and $\hat{\psi}_d = \hat{d}$. Let us introduce

$$ \hat{q} = \begin{pmatrix} \hat{u} \\ \hat{d} \end{pmatrix} \tag{12.86} $$

so that $\hat{\mathcal{L}}$ can be compactly written as

$$ \hat{\mathcal{L}} = \bar{\hat{q}}(i\not{\partial} - m)\hat{q}. \tag{12.87} $$

In this form it is obvious that $\hat{\mathcal{L}}$, and hence the associated Hamiltonian $\hat{H}$,
is invariant under the global U(1) transformation

$$ \hat{q}' = e^{-i\alpha}\hat{q} \tag{12.88} $$

(cf (12.1)) which is associated with baryon number conservation. It is also
invariant under global SU(2)$_{\rm f}$ transformations acting in the flavour u-d space
(cf (12.32)):

$$ \hat{q}' = \exp(-i\boldsymbol{\alpha}\cdot\boldsymbol{\tau}/2)\, \hat{q} \tag{12.89} $$

(for the change in sign with respect to (12.51), compare section 7.1 and section
7.2 in the U(1) case). In (12.89), the three parameters $\boldsymbol{\alpha}$ are independent of
$x$.

What are the conserved quantities associated with the invariance of $\hat{\mathcal{L}}$
under (12.89)? Let us recall the discussion of the simpler U(1) cases studied
in sections 7.1 and 7.2. Considering the complex scalar field of section 7.1, the
analogue of (12.89) was just $\hat{\phi} \to \hat{\phi}' = e^{-i\alpha}\hat{\phi}$, and the conserved quantity was
the Hermitian operator $\hat{N}_\phi$ which appeared in the exponent of the unitary
operator $\hat{U}$ that effected the transformation $\hat{\phi} \to \hat{\phi}'$ via

$$ \hat{\phi}' = \hat{U}\hat{\phi}\hat{U}^\dagger, \tag{12.90} $$

with

$$ \hat{U} = \exp(i\alpha\hat{N}_\phi). \tag{12.91} $$

For an infinitesimal $\epsilon$, we have

$$ \hat{\phi}' \approx (1 - i\epsilon)\hat{\phi}, \qquad \hat{U} \approx 1 + i\epsilon\hat{N}_\phi, \tag{12.92} $$

so that (12.90) becomes

$$ (1 - i\epsilon)\hat{\phi} = (1 + i\epsilon\hat{N}_\phi)\hat{\phi}(1 - i\epsilon\hat{N}_\phi) \approx \hat{\phi} + i\epsilon[\hat{N}_\phi, \hat{\phi}]; \tag{12.93} $$

hence we require

$$ [\hat{N}_\phi, \hat{\phi}] = -\hat{\phi} \tag{12.94} $$

for consistency. Insofar as $\hat{N}_\phi$ determines the form of an infinitesimal version
of the unitary transformation operator $\hat{U}$, it seems reasonable to call it the
generator of these global U(1) transformations (compare the discussion after
(12.27) and (12.35), but note that here $\hat{N}_\phi$ is a quantum field operator, not a
matrix).

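Consider now the SU(2)$_{\rm f}$ transformation (12.89), in the infinitesimal case:

$$ \hat{q}' = (1 - i\boldsymbol{\epsilon}\cdot\boldsymbol{\tau}/2)\, \hat{q}. \tag{12.95} $$

Since the single U(1) parameter $\epsilon$ is now replaced by the three parameters
$\boldsymbol{\epsilon} = (\epsilon_1, \epsilon_2, \epsilon_3)$, we shall need three analogues of $\hat{N}_\phi$, which we call

$$ \hat{\mathbf{T}}^{(\frac{1}{2})} = (\hat{T}_1^{(\frac{1}{2})}, \hat{T}_2^{(\frac{1}{2})}, \hat{T}_3^{(\frac{1}{2})}), \tag{12.96} $$

corresponding to the three independent infinitesimal SU(2) transformations.
The generalizations of (12.90) and (12.91) are then

$$ \hat{q}' = \hat{U}^{(\frac{1}{2})}\, \hat{q}\, \hat{U}^{(\frac{1}{2})\dagger} \tag{12.97} $$

and

$$ \hat{U}^{(\frac{1}{2})} = \exp(i\boldsymbol{\alpha}\cdot\hat{\mathbf{T}}^{(\frac{1}{2})}) \tag{12.98} $$

where the $\hat{\mathbf{T}}^{(\frac{1}{2})}$'s are Hermitian, so that $\hat{U}^{(\frac{1}{2})}$ is unitary (cf (12.35)). It would
seem reasonable in this case too to regard the $\hat{\mathbf{T}}^{(\frac{1}{2})}$ as providing a field
theoretic representation of the generators of SU(2)$_{\rm f}$, an interpretation we shall
shortly confirm. In the infinitesimal case, (12.97) and (12.98) become

$$ (1 - i\boldsymbol{\epsilon}\cdot\boldsymbol{\tau}/2)\, \hat{q} = (1 + i\boldsymbol{\epsilon}\cdot\hat{\mathbf{T}}^{(\frac{1}{2})})\, \hat{q}\, (1 - i\boldsymbol{\epsilon}\cdot\hat{\mathbf{T}}^{(\frac{1}{2})}), \tag{12.99} $$

using the Hermiticity of the $\hat{\mathbf{T}}^{(\frac{1}{2})}$'s. Expanding the right-hand side of (12.99)
to first order in $\boldsymbol{\epsilon}$, and equating coefficients of $\boldsymbol{\epsilon}$ on both sides, (12.99) reduces
to (problem 12.9)

$$ [\hat{\mathbf{T}}^{(\frac{1}{2})}, \hat{q}] = -(\boldsymbol{\tau}/2)\, \hat{q}, \tag{12.100} $$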

which is the analogue of (12.94). Equation (12.100) expresses a very specific
commutation property of the operators $\hat{\mathbf{T}}^{(\frac{1}{2})}$, which turns out to be satisfied
by the expression

$$ \hat{\mathbf{T}}^{(\frac{1}{2})} = \int \hat{q}^\dagger (\boldsymbol{\tau}/2)\, \hat{q}\; d^3x \tag{12.101} $$

as can be checked (problem 12.10) from the anticommutation relations of
the fermionic fields in $\hat{q}$. We shall derive (12.101) from Noether’s theorem
(Noether 1918) in a little while. Note that if ‘$\boldsymbol{\tau}/2$’ is replaced by 1, (12.101)
reduces to the sum of the u and d number operators, as required for the
one-parameter U(1) case. The ‘$\hat{q}^\dagger\boldsymbol{\tau}\hat{q}$’ combination is precisely the field-theoretic
version of the $q^\dagger\boldsymbol{\tau}q$ coupling we discussed in section 12.2.3. It means that the
three operators $\hat{\mathbf{T}}^{(\frac{1}{2})}$ themselves belong to a $T = 1$ triplet of SU(2)$_{\rm f}$.

It is possible to verify that these $\hat{\mathbf{T}}^{(\frac{1}{2})}$'s do indeed commute with the
Hamiltonian $\hat{H}$:

$$ d\hat{\mathbf{T}}^{(\frac{1}{2})}/dt = -i[\hat{\mathbf{T}}^{(\frac{1}{2})}, \hat{H}] = 0 \tag{12.102} $$

so that their eigenvalues are conserved. That the $\hat{\mathbf{T}}^{(\frac{1}{2})}$ are, as already
suggested, a field theoretic representation of the generators of SU(2), appropriate
to the case $T = \frac{1}{2}$, follows from the fact that they obey the SU(2) algebra
(problem 12.11):

$$ [\hat{T}_i^{(\frac{1}{2})}, \hat{T}_j^{(\frac{1}{2})}] = i\epsilon_{ijk}\hat{T}_k^{(\frac{1}{2})}. \tag{12.103} $$
For many purposes it is more useful to consider the raising and lowering
operators

$$ \hat{T}_\pm^{(\frac{1}{2})} = (\hat{T}_1^{(\frac{1}{2})} \pm i\hat{T}_2^{(\frac{1}{2})}). \tag{12.104} $$

For example, we easily find

$$ \hat{T}_+^{(\frac{1}{2})} = \int \hat{u}^\dagger \hat{d}\; d^3x, \tag{12.105} $$

which destroys a d quark and creates a u, or destroys a $\bar{u}$ and creates a $\bar{d}$, in
either case raising the $\hat{T}_3^{(\frac{1}{2})}$ eigenvalue by +1, since

$$ \hat{T}_3^{(\frac{1}{2})} = \frac{1}{2}\int (\hat{u}^\dagger\hat{u} - \hat{d}^\dagger\hat{d})\; d^3x \tag{12.106} $$

which counts $+\frac{1}{2}$ for each u (or $\bar{d}$) and $-\frac{1}{2}$ for each d (or $\bar{u}$). Thus these
operators certainly ‘do the job’ expected of field theoretic isospin operators,
in this isospin-1/2 case.
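
The operator relations above can be illustrated in a drastically simplified toy model. The sketch below, assuming NumPy, replaces the quark fields by just two fermionic modes (one ‘u’ and one ‘d’, with no antiparticles, no spin and no spatial integral), realized as $4 \times 4$ matrices by a Jordan-Wigner construction. This is an assumption made purely for illustration and is not the field theory itself; the names `a`, `T`, `comm` and `T_plus` are hypothetical.

```python
import numpy as np

tau = np.array([[[0, 1], [1, 0]],
                [[0, -1j], [1j, 0]],
                [[1, 0], [0, -1]]], dtype=complex)

# Single-mode operators in the basis (|0>, |1>)
lower = np.array([[0, 1], [0, 0]], dtype=complex)   # annihilates |1> to |0>
parity = np.diag([1.0, -1.0]).astype(complex)       # (-1)^n, for the JW string
one = np.eye(2, dtype=complex)

# Two anticommuting modes: a[0] destroys a 'u', a[1] destroys a 'd'
a = [np.kron(lower, one), np.kron(parity, lower)]
adag = [x.conj().T for x in a]

def comm(x, y):
    return x @ y - y @ x

# Toy analogue of (12.101): T_i = sum_rs a_r^dagger (tau_i/2)_rs a_s
T = [sum(adag[r] @ a[s] * tau[i][r, s] / 2
         for r in range(2) for s in range(2)) for i in range(3)]

# Toy analogue of (12.100): [T_i, a_r] = -(tau_i/2)_rs a_s
relation_12_100 = all(
    np.allclose(comm(T[i], a[r]),
                -sum(tau[i][r, s] / 2 * a[s] for s in range(2)))
    for i in range(3) for r in range(2)
)

# Raising operator, toy analogue of (12.105): T_+ = a_u^dagger a_d
T_plus = T[0] + 1j * T[1]
vacuum = np.zeros(4, dtype=complex)
vacuum[0] = 1
d_state = adag[1] @ vacuum
u_state = adag[0] @ vacuum
print(relation_12_100, np.allclose(T_plus @ d_state, u_state))
```

In this toy model `T_plus` turns the one-d state into the one-u state, and `T[2]` has eigenvalue $+\frac{1}{2}$ on the u state and $-\frac{1}{2}$ on the d state, mirroring (12.105) and (12.106).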

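In the U(1) case, considering now the fermionic example of section 7.2 for
variety, we could go further and associate the conserved operator $\hat{N}_\psi$ with a
conserved current $\hat{N}_\psi^\mu$:

$$ \hat{N}_\psi = \int \hat{N}_\psi^0\; d^3x, \qquad \hat{N}_\psi^\mu = \bar{\hat{\psi}}\gamma^\mu\hat{\psi} \tag{12.107} $$

where

$$ \partial_\mu \hat{N}_\psi^\mu = 0. \tag{12.108} $$

The obvious generalization appropriate to (12.101) is

$$ \hat{\mathbf{T}}^{(\frac{1}{2})} = \int \hat{\mathbf{T}}^{(\frac{1}{2})0}\; d^3x, \qquad \hat{\mathbf{T}}^{(\frac{1}{2})\mu} = \bar{\hat{q}}\gamma^\mu\frac{\boldsymbol{\tau}}{2}\hat{q}. \tag{12.109} $$

Note that both $\hat{N}_\psi^\mu$ and $\hat{\mathbf{T}}^{(\frac{1}{2})\mu}$ are of course functions of the space-time
coordinate $x$, via the (suppressed) dependence of the $\hat{q}$-fields on $x$. Indeed one
can verify from the equations of motion that

$$ \partial_\mu \hat{\mathbf{T}}^{(\frac{1}{2})\mu} = 0. \tag{12.110} $$

Thus $\hat{\mathbf{T}}^{(\frac{1}{2})\mu}$ is a conserved isospin current operator appropriate to the $T = \frac{1}{2}$
(u, d) system; it transforms as a 4-vector under Lorentz transformations, and
as a $T = 1$ triplet under SU(2)$_{\rm f}$ transformations.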

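Clearly there should be some general formalism for dealing with all this
more efficiently, and it is provided by a generalization of the steps followed,
in the U(1) case, in equations (7.6)-(7.8). Suppose the Lagrangian involves
a set of fields $\hat{\psi}_r$ (they could be bosons or fermions) and suppose that it is
invariant under the infinitesimal transformation

$$ \delta\hat{\psi}_r = -i\epsilon T_{rs}\hat{\psi}_s \tag{12.111} $$

for some set of numerical coefficients $T_{rs}$. Equation (12.111) generalizes (7.5).
Then since $\hat{\mathcal{L}}$ is invariant under this change,

$$ 0 = \delta\hat{\mathcal{L}} = \frac{\partial\hat{\mathcal{L}}}{\partial\hat{\psi}_r}\delta\hat{\psi}_r + \frac{\partial\hat{\mathcal{L}}}{\partial(\partial^\mu\hat{\psi}_r)}\partial^\mu(\delta\hat{\psi}_r). \tag{12.112} $$

But

$$ \frac{\partial\hat{\mathcal{L}}}{\partial\hat{\psi}_r} = \partial^\mu\left(\frac{\partial\hat{\mathcal{L}}}{\partial(\partial^\mu\hat{\psi}_r)}\right) \tag{12.113} $$

from the equations of motion. Hence

$$ \partial^\mu\left(\frac{\partial\hat{\mathcal{L}}}{\partial(\partial^\mu\hat{\psi}_r)}\delta\hat{\psi}_r\right) = 0 \tag{12.114} $$

which is precisely a current conservation law of the form

$$ \partial^\mu \hat{j}_\mu = 0. \tag{12.115} $$

Indeed, disregarding the irrelevant constant small parameter $\epsilon$, the conserved
current is

$$ \hat{j}_\mu = -i\frac{\partial\hat{\mathcal{L}}}{\partial(\partial^\mu\hat{\psi}_r)}T_{rs}\hat{\psi}_s. \tag{12.116} $$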
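Let us try this out on (12.87) with

$$ \delta\hat{q} = (-i\boldsymbol{\epsilon}\cdot\boldsymbol{\tau}/2)\, \hat{q}. \tag{12.117} $$

As we know already, there are now three $\epsilon$'s, and so three $T_{rs}$'s, namely
$\frac{1}{2}(\tau_1)_{rs}$, $\frac{1}{2}(\tau_2)_{rs}$, $\frac{1}{2}(\tau_3)_{rs}$. For each one we have a current, for example

$$ \hat{T}_{1\mu}^{(\frac{1}{2})} = -i\frac{\partial\hat{\mathcal{L}}}{\partial(\partial^\mu\hat{q})}\frac{\tau_1}{2}\hat{q} = \bar{\hat{q}}\gamma_\mu\frac{\tau_1}{2}\hat{q} \tag{12.118} $$

and similarly for the other $\tau$'s, and so we recover (12.109). From the
invariance of the Lagrangian under the transformation (12.117) there follows the
conservation of an associated symmetry current. This is the quantum field
theory version of Noether's theorem.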

This theorem is of fundamental significance as it tells us how to relate 
symmetries (under transformations of the general form (12.111)) to ‘current’ 
conservation laws (of the form (12.115)), and it constructs the actual currents
for us. In gauge theories, the dynamics is generated from a symmetry, in 
the sense that (as we have seen in the local U(1) of electromagnetism) the 
symmetry currents are the dynamical currents that drive the equations for 
the force field. Thus the symmetries of the Lagrangian are basic to gauge 
field theories. 

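Let us look at another example, this time involving spin-0 fields. Suppose
we have three spin-0 fields all with the same mass, and take

$$ \hat{\mathcal{L}} = \tfrac{1}{2}\partial_\mu\hat{\phi}_1\partial^\mu\hat{\phi}_1 + \tfrac{1}{2}\partial_\mu\hat{\phi}_2\partial^\mu\hat{\phi}_2 + \tfrac{1}{2}\partial_\mu\hat{\phi}_3\partial^\mu\hat{\phi}_3 - \tfrac{1}{2}m^2(\hat{\phi}_1^2 + \hat{\phi}_2^2 + \hat{\phi}_3^2). \tag{12.119} $$

It is obvious that $\hat{\mathcal{L}}$ is invariant under an arbitrary rotation of the three $\hat{\phi}$'s
among themselves, generalizing the ‘rotation about the 3-axis’ considered for
the $\hat{\phi}_1$-$\hat{\phi}_2$ system of section 7.1. An infinitesimal such rotation is (cf (12.64),
and noting the sign change in the field theory case)

$$ \hat{\boldsymbol{\phi}}' = \hat{\boldsymbol{\phi}} + \boldsymbol{\epsilon}\times\hat{\boldsymbol{\phi}} \tag{12.120} $$

which implies

$$ \delta\hat{\phi}_r = -i\epsilon_a T_{a\,rs}^{(1)}\hat{\phi}_s, \tag{12.121} $$

with

$$ T_{a\,rs}^{(1)} = -i\epsilon_{ars} \tag{12.122} $$

as in (12.48). There are of course three conserved $\hat{T}$ operators again, and three
$\hat{T}^\mu$'s, which we call $\hat{\mathbf{T}}^{(1)}$ and $\hat{\mathbf{T}}^{(1)\mu}$ respectively, since we are now dealing with
a $T = 1$ isospin case. The $a = 1$ component of the conserved current in this
case is, from (12.116),

$$ \hat{T}_1^{(1)\mu} = \hat{\phi}_2\partial^\mu\hat{\phi}_3 - \hat{\phi}_3\partial^\mu\hat{\phi}_2. \tag{12.123} $$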


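Cyclic permutations give us the other components, which can be summarised
as

$$ \hat{\mathbf{T}}^{(1)\mu} = \frac{i}{2}\left(\hat{\phi}^{(1)\rm tr}\,\mathbf{T}^{(1)}\,\partial^\mu\hat{\phi}^{(1)} - (\partial^\mu\hat{\phi}^{(1)})^{\rm tr}\,\mathbf{T}^{(1)}\,\hat{\phi}^{(1)}\right) \tag{12.124} $$

where we have written

$$ \hat{\phi}^{(1)} = \begin{pmatrix} \hat{\phi}_1 \\ \hat{\phi}_2 \\ \hat{\phi}_3 \end{pmatrix} \tag{12.125} $$

and ‘tr’ denotes transpose; the factor $\frac{1}{2}$ is fixed by agreement with (12.123)
for the real fields of (12.119). Equation (12.124) has the form expected of a
bosonic spin-0 current, but with the matrices $\mathbf{T}^{(1)}$ appearing, appropriate
to the $T = 1$ (triplet) representation of SU(2)$_{\rm f}$.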
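
The algebra behind the spin-0 isospin current can be checked at a single space-time point by treating the field values and one component of their gradient as ordinary real numbers. The sketch below assumes NumPy; the numerical values of `phi`, `dphi` and `eps_vec` are arbitrary, and the symmetrized expression is normalized with a factor $\frac{1}{2}$ so that it reproduces (12.123) for real fields, which is an assumption about the normalization rather than a statement taken from the text.

```python
import numpy as np

def levi_civita(i, j, k):
    """Totally antisymmetric symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) / 2

# T = 1 matrices of (12.122): (T_a)_rs = -i eps_ars
T1 = np.array([[[-1j * levi_civita(a, r, s) for s in range(3)]
                for r in range(3)] for a in range(3)])

phi = np.array([0.7, -1.2, 0.4])     # field values at one point (arbitrary)
dphi = np.array([0.3, 0.5, -0.9])    # one component of d^mu phi (arbitrary)
eps_vec = np.array([0.02, -0.01, 0.03])

# (12.121): delta phi_r = -i eps_a (T_a)_rs phi_s should equal eps x phi
delta_phi = (-1j * np.einsum('a,ars,s->r', eps_vec, T1, phi)).real

# (12.116) with dL/d(d^mu phi_r) = d_mu phi_r for the Lagrangian (12.119)
noether = np.array([(-1j * dphi @ T1[a] @ phi).real for a in range(3)])

# Symmetrized form, normalized to agree with (12.123)
symmetrized = np.array([
    (0.5j * (phi @ T1[a] @ dphi - dphi @ T1[a] @ phi)).real
    for a in range(3)
])

print(noether)
print(np.cross(phi, dphi))   # components phi_2 dphi_3 - phi_3 dphi_2, etc.
```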

The general form of such SU(2) currents should now be clear. For an
isospin $T$-multiplet of bosons we shall have the form

$$ i\left(\hat{\phi}^{(T)\dagger}\,\mathbf{T}^{(T)}\,\partial^\mu\hat{\phi}^{(T)} - (\partial^\mu\hat{\phi}^{(T)})^\dagger\,\mathbf{T}^{(T)}\,\hat{\phi}^{(T)}\right) \tag{12.126} $$

where we have put the $\dagger$ to allow for possibly complex fields; and for an isospin
$T$-multiplet of fermions we shall have

$$ \bar{\hat{\psi}}^{(T)}\gamma^\mu\,\mathbf{T}^{(T)}\,\hat{\psi}^{(T)} \tag{12.127} $$

where in each case the $(2T + 1)$ components of $\hat{\phi}$ or $\hat{\psi}$ transform as a
$T$-multiplet under SU(2), i.e.

$$ \hat{\psi}^{(T)\prime} = \exp(-i\boldsymbol{\alpha}\cdot\mathbf{T}^{(T)})\,\hat{\psi}^{(T)} \tag{12.128} $$

and similarly for $\hat{\phi}^{(T)}$, where $\mathbf{T}^{(T)}$ are the $(2T + 1) \times (2T + 1)$ matrices representing
the generators of SU(2)$_{\rm f}$ in this representation. In all cases, the integral over
all space of the $\mu = 0$ component of these currents results in a triplet of isospin
operators obeying the SU(2) algebra (12.47), as in (12.103).
The cases considered so far have all been free field theories, but
SU(2)-invariant interactions can be easily formed. For example, the interaction
$g_1\bar{\hat{\psi}}\boldsymbol{\tau}\hat{\psi}\cdot\hat{\boldsymbol{\phi}}$ describes SU(2)-invariant interactions between a $T = \frac{1}{2}$ isospinor
(spin-$\frac{1}{2}$) field $\hat{\psi}$, and a $T = 1$ isotriplet (Lorentz scalar) $\hat{\boldsymbol{\phi}}$. An effective
interaction between pions and nucleons could take the form $g_\pi\bar{\hat{\psi}}\boldsymbol{\tau}\gamma_5\hat{\psi}\cdot\hat{\boldsymbol{\phi}}$, allowing
for the pseudoscalar nature of the pions (we shall see in the following section
that $\bar{\hat{\psi}}\gamma_5\hat{\psi}$ is a pseudoscalar, so the product is a true scalar as is required for a
parity-conserving strong interaction). In these examples the ‘vector’ analogy
for the $T = 1$ states allows us to see that the ‘dot product’ will be invariant.
A similar dot product occurs in the interaction between the isospinor $\hat{q}$
and the weak SU(2) gauge field $\hat{\boldsymbol{W}}_\mu$, which has the form

$$ g\,\bar{\hat{q}}\gamma^\mu\frac{\boldsymbol{\tau}}{2}\hat{q}\cdot\hat{\boldsymbol{W}}_\mu \tag{12.129} $$

as will be discussed in the following chapter. This is just the SU(2) dot product
of the symmetry current (12.109) and the gauge field triplet, both of which
are in the adjoint ($T = 1$) representation of SU(2).
All of the foregoing can be generalized straightforwardly to SU(3)$_{\rm f}$. For
example, the Lagrangian

$$\hat{\mathcal L} = \bar{\hat q}(i\not\partial - m)\hat q \tag{12.130}$$

with $\hat q$ now extended to

$$\hat q = \begin{pmatrix}\hat u\\ \hat d\\ \hat s\end{pmatrix} \tag{12.131}$$

describes free u, d and s quarks of equal mass $m$. $\hat{\mathcal L}$ is clearly
invariant under global SU(3)$_{\rm f}$ transformations

$$\hat q' = \exp(-i\boldsymbol\alpha\cdot\boldsymbol\lambda/2)\,\hat q, \tag{12.132}$$

as well as the usual global U(1) transformation associated with quark number
conservation. The associated Noether currents are (in somewhat informal
notation)

$$\hat G^{(a)\mu} = \bar{\hat q}\gamma^\mu\frac{\lambda_a}{2}\hat q, \qquad a = 1,2,\ldots 8 \tag{12.133}$$

(note that there are eight of them), and the associated conserved ‘charge
operators’ are

$$\hat G^{(a)} = \int \hat G^{(a)0}\,d^3x = \int \hat q^\dagger\frac{\lambda_a}{2}\hat q\,d^3x, \qquad a = 1,2,\ldots 8, \tag{12.134}$$

which obey the SU(3) commutation relations

$$[\hat G^{(a)}, \hat G^{(b)}] = i f_{abc}\hat G^{(c)}. \tag{12.135}$$

SU(3)-invariant interactions can also be formed. A particularly important
one is the ‘SU(3) dot-product’ of two octets (the analogues of the SU(2)
triplets), which arises in the quark-gluon vertex of QCD (see chapters 13 and
14):

$$-i g_s \sum_f \bar{\hat q}_f\gamma^\mu\frac{\lambda_a}{2}\hat q_f\,\hat A^a_\mu. \tag{12.136}$$

In (12.136), $\hat q_f$ stands for the SU(3)$_{\rm c}$ colour triplet

$$\hat q_f = \begin{pmatrix}\hat f_r\\ \hat f_b\\ \hat f_g\end{pmatrix} \tag{12.137}$$

where $\hat f$ is any of the six quark flavour fields $\hat u, \hat d, \hat c, \hat s, \hat t, \hat b$,
and $\hat A^a_\mu$ are the 8 ($a = 1,2,\ldots 8$) gluon fields. Once again, (12.136) has
the form ‘symmetry current $\cdot$ gauge field’ characteristic of all gauge interactions.
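
The algebra (12.135) is realized most simply by the 3 x 3 matrices $\lambda_a/2$ themselves. The following is a minimal numerical sketch, assuming the standard Gell-Mann form of the $\lambda_a$ (which is not written out in this section) and using NumPy; the names `lam`, `G`, `f` and `closes` are illustrative only. It extracts the structure constants from traces, using ${\rm Tr}(\lambda_a\lambda_b) = 2\delta_{ab}$, and checks that the commutators close. It tests the matrix representation, not the field operators $\hat G^{(a)}$.

```python
import numpy as np

# Gell-Mann matrices lambda_1..lambda_8 (standard convention, assumed here).
lam = np.zeros((8, 3, 3), dtype=complex)
lam[0] = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
lam[1] = [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]]
lam[2] = [[1, 0, 0], [0, -1, 0], [0, 0, 0]]
lam[3] = [[0, 0, 1], [0, 0, 0], [1, 0, 0]]
lam[4] = [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]]
lam[5] = [[0, 0, 0], [0, 0, 1], [0, 1, 0]]
lam[6] = [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]]
lam[7] = np.diag([1, 1, -2]) / np.sqrt(3)

G = lam / 2  # generators in the triplet representation


def comm(a, b):
    """Matrix commutator [a, b]."""
    return a @ b - b @ a


# Structure constants: f_abc = -2i Tr([G_a, G_b] G_c), from Tr(G_a G_b) = delta_ab / 2.
f = np.zeros((8, 8, 8))
for a in range(8):
    for b in range(8):
        c_ab = comm(G[a], G[b])
        for c in range(8):
            f[a, b, c] = (-2j * np.trace(c_ab @ G[c])).real


def closes(a, b):
    """Check [G_a, G_b] = i f_abc G_c for one pair (a, b)."""
    rhs = sum(1j * f[a, b, c] * G[c] for c in range(8))
    return np.allclose(comm(G[a], G[b]), rhs)


algebra_ok = all(closes(a, b) for a in range(8) for b in range(8))
print("SU(3) algebra closes:", algebra_ok)
print("f_123 =", f[0, 1, 2])
```

Indices in the code run from 0, so `f[0, 1, 2]` corresponds to $f_{123}$.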


12.4.2 Chiral symmetry 


As our final example of a global non-Abelian symmetry, we shall introduce 
the idea of chiral symmetry, which is an exact symmetry for fermions in the 
limit in which their masses may be neglected. We have seen that the u and 
d quarks have indeed very small masses (< 5 MeV) on hadronic scales, and
even the s quark mass (~ 100 MeV) is relatively small. Thus we may certainly
expect some physical signs of the symmetry associated with $m_u \approx m_d \approx 0$,
and possibly also of the larger symmetry holding when $m_u \approx m_d \approx m_s \approx 0$.
As we shall see, however, this expectation leads to a puzzle, the resolution of
which will have to be postponed until the concept of ‘spontaneous symmetry
breaking’ has been developed in Part VII.

We begin with the simplest case of just one fermion. Since we are interested 
in the ‘small mass’ regime, it is sensible to use the representation (3.40) of 
the Dirac matrices, in which the momentum part of the Dirac Hamiltonian is 
‘diagonal’ and the mass appears as an ‘off-diagonal’ coupling: 


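$$\boldsymbol\alpha = \begin{pmatrix}\boldsymbol\sigma & 0\\ 0 & -\boldsymbol\sigma\end{pmatrix}, \qquad \beta = \begin{pmatrix}0 & 1\\ 1 & 0\end{pmatrix}. \tag{12.138}$$

Writing the general Dirac spinor $\omega$ as

$$\omega = \begin{pmatrix}\phi\\ \chi\end{pmatrix} \tag{12.139}$$

we have (as in (4.14), (4.15))

$$E\phi = \boldsymbol\sigma\cdot\boldsymbol p\,\phi + m\chi \tag{12.140}$$
$$E\chi = -\boldsymbol\sigma\cdot\boldsymbol p\,\chi + m\phi. \tag{12.141}$$

We now recall the matrix $\gamma_5$ introduced in section 4.2.1,

$$\gamma_5 = i\gamma^0\gamma^1\gamma^2\gamma^3, \tag{12.142}$$

which takes the form

$$\gamma_5 = \begin{pmatrix}1 & 0\\ 0 & -1\end{pmatrix} \tag{12.143}$$

in this representation. The matrix $\gamma_5$ plays a prominent role in chiral symmetry,
as we shall see. Its defining property is that it anticommutes with the $\gamma^\mu$
matrices:

$$\{\gamma_5, \gamma^\mu\} = 0. \tag{12.144}$$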
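
The statements about $\gamma_5$ in this representation can be checked by direct matrix multiplication. The sketch below assumes NumPy and builds $\gamma^0 = \beta$, $\gamma^i = \beta\alpha_i$ from the $\boldsymbol\alpha$ and $\beta$ of (12.138); the variable names are illustrative.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

# alpha and beta in the representation (12.138).
beta = np.block([[Z2, I2], [I2, Z2]])
alpha = [np.block([[s, Z2], [Z2, -s]]) for s in sigma]

# gamma^0 = beta, gamma^i = beta alpha_i.
gamma = [beta] + [beta @ a for a in alpha]

# Definition (12.142).
gamma5 = 1j * gamma[0] @ gamma[1] @ gamma[2] @ gamma[3]


def anticomm(a, b):
    """Matrix anticommutator {a, b}."""
    return a @ b + b @ a


anticommutes = all(np.allclose(anticomm(gamma5, g), 0) for g in gamma)
print(np.real_if_close(gamma5))
print("gamma5 anticommutes with every gamma^mu:", anticommutes)
```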


‘Chirality’ means ‘handedness’, from the Greek word for hand, χείρ. Its
use here stems from the fact that, in the limit $m \to 0$, the 2-component spinors
$\phi$, $\chi$ become helicity eigenstates (cf problem 9.4), having definite ‘handedness’.
As $m \to 0$ we have $E \to |\boldsymbol p|$, and (12.140) and (12.141) reduce to

$$(\boldsymbol\sigma\cdot\boldsymbol p/|\boldsymbol p|)\,\tilde\phi = \tilde\phi \tag{12.145}$$
$$(\boldsymbol\sigma\cdot\boldsymbol p/|\boldsymbol p|)\,\tilde\chi = -\tilde\chi \tag{12.146}$$

so that the limiting spinor $\tilde\phi$ has positive helicity, and $\tilde\chi$ negative
helicity (cf (3.68) and (3.69)). In this $m \to 0$ limit, the two helicity spinors are
decoupled, reflecting the fact that no Lorentz transformation can reverse the helicity of
a massless particle. Also in this limit, the Dirac energy operator is

$$\boldsymbol\alpha\cdot\boldsymbol p = \begin{pmatrix}\boldsymbol\sigma\cdot\boldsymbol p & 0\\ 0 & -\boldsymbol\sigma\cdot\boldsymbol p\end{pmatrix} \tag{12.147}$$

which is easily seen to commute with $\gamma_5$. Thus the massless states may equivalently
be classified by the eigenvalues of $\gamma_5$, which are clearly $\pm 1$ since $\gamma_5^2 = I$.
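
As a numerical illustration of (12.145)-(12.147), the sketch below (assuming NumPy, with an arbitrarily chosen momentum `p`; all names are illustrative) diagonalizes $\boldsymbol\sigma\cdot\boldsymbol p/|\boldsymbol p|$, builds the two massless positive-energy spinors, and checks that each is simultaneously an eigenstate of $\boldsymbol\alpha\cdot\boldsymbol p$ with $E = |\boldsymbol p|$ and of $\gamma_5$.

```python
import numpy as np

sigma = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)

p = np.array([0.3, -1.2, 0.7])           # arbitrary illustrative momentum
pmag = np.linalg.norm(p)
sp = np.einsum('a,aij->ij', p, sigma)    # sigma . p

Z2 = np.zeros((2, 2), dtype=complex)
H0 = np.block([[sp, Z2], [Z2, -sp]])     # alpha . p, as in (12.147)
gamma5 = np.diag([1, 1, -1, -1]).astype(complex)

# Helicity eigenvectors; eigh returns eigenvalues in ascending order (-1, +1).
helicities, vecs = np.linalg.eigh(sp / pmag)
chi_minus = vecs[:, 0]   # negative helicity, eq. (12.146)
phi_plus = vecs[:, 1]    # positive helicity, eq. (12.145)

u_R = np.concatenate([phi_plus, [0, 0]])    # 'phi-type' spinor
u_L = np.concatenate([[0, 0], chi_minus])   # 'chi-type' spinor

commutator = H0 @ gamma5 - gamma5 @ H0
print("helicity eigenvalues:", helicities)
print("[alpha.p, gamma5] = 0:", np.allclose(commutator, 0))
```

Both spinors have energy $+|\boldsymbol p|$; for them the $\gamma_5$ eigenvalue coincides with the helicity.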
Consider then a massless fermion with positive helicity. It is described
by the ‘u’-spinor $\begin{pmatrix}\tilde\phi\\ 0\end{pmatrix}$, which is an eigenstate of
$\gamma_5$ with eigenvalue $+1$. Similarly, a fermion with negative helicity is described by
$\begin{pmatrix}0\\ \tilde\chi\end{pmatrix}$, which has $\gamma_5 = -1$. Thus for these states
chirality equals helicity. We have to be more careful for antifermions, however. A
physical antifermion of energy $E$ and momentum $\boldsymbol p$ is described by a ‘v’-spinor
corresponding to $-E$ and $-\boldsymbol p$; but with $m = 0$ in (12.140) and (12.141) the
equations for $\phi$ and $\chi$ remain the same for $-E, -\boldsymbol p$ as for $E, \boldsymbol p$.
Consider the spin, however. If the physical antiparticle has positive helicity, with
$\boldsymbol p$ along the $z$-axis say, then $s_z = +\frac{1}{2}$. The corresponding v-spinor
must then have $s_z = -\frac{1}{2}$ (see section 3.4.3) and must therefore be of
$\tilde\chi$ type (12.146). So the v-spinor for this antifermion of positive helicity is
$\begin{pmatrix}0\\ \tilde\chi\end{pmatrix}$, which has $\gamma_5 = -1$. In summary, for
fermions the $\gamma_5$ eigenvalue is equal to the helicity, and for antifermions it is equal
to minus the helicity. It is the $\gamma_5$ eigenvalue that is called the ‘chirality’.

In the massless limit, the chirality of $\tilde\phi$ and $\tilde\chi$ is a good quantum number
($\gamma_5$ commuting with the energy operator), and we may say that ‘chirality is
conserved’ in this massless limit. On the other hand, the massive spinor $\omega$ is
clearly not an eigenstate of chirality:

$$\gamma_5\omega = \begin{pmatrix}\phi\\ -\chi\end{pmatrix} \neq \lambda\begin{pmatrix}\phi\\ \chi\end{pmatrix}. \tag{12.148}$$

Referring to (12.140) and (12.141), we may therefore regard the mass terms
as ‘coupling the states of different chirality’.

It is usual to introduce operators $P_{\rm R,L} = \left(\frac{1\pm\gamma_5}{2}\right)$ which
‘project’ out states of definite chirality from $\omega$:

$$\omega = \left(\frac{1+\gamma_5}{2}\right)\omega + \left(\frac{1-\gamma_5}{2}\right)\omega \equiv P_{\rm R}\,\omega + P_{\rm L}\,\omega \equiv \omega_{\rm R} + \omega_{\rm L}, \tag{12.149}$$

so that

$$\omega_{\rm R} = \begin{pmatrix}1 & 0\\ 0 & 0\end{pmatrix}\begin{pmatrix}\phi\\ \chi\end{pmatrix} = \begin{pmatrix}\phi\\ 0\end{pmatrix}, \qquad \omega_{\rm L} = \begin{pmatrix}0 & 0\\ 0 & 1\end{pmatrix}\begin{pmatrix}\phi\\ \chi\end{pmatrix} = \begin{pmatrix}0\\ \chi\end{pmatrix}. \tag{12.150}$$

Then clearly $\gamma_5\omega_{\rm R} = \omega_{\rm R}$ and $\gamma_5\omega_{\rm L} = -\omega_{\rm L}$;
slightly confusingly, the notation ‘R’, ‘L’ is used for the chirality eigenvalue.
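
The projector properties behind (12.149) and (12.150) can be verified in a few lines. The sketch below assumes NumPy and the representation (12.138), (12.143); the spinor components `phi` and `chi` are arbitrary illustrative numbers. It also evaluates the mass bilinear $\omega^\dagger\beta\omega$ between chirality components, which makes explicit the sense in which the mass term couples different chiralities.

```python
import numpy as np

gamma5 = np.diag([1, 1, -1, -1]).astype(complex)   # form (12.143)
one = np.eye(4, dtype=complex)
I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
beta = np.block([[Z2, I2], [I2, Z2]])              # beta of (12.138)

P_R = (one + gamma5) / 2
P_L = (one - gamma5) / 2

# An arbitrary (illustrative) massive spinor omega = (phi, chi).
phi = np.array([0.6, 0.8j])
chi = np.array([-0.3, 0.5])
omega = np.concatenate([phi, chi])

omega_R = P_R @ omega
omega_L = P_L @ omega

# Mass bilinear omega^dagger beta omega split by chirality.
mass_RR = omega_R.conj() @ beta @ omega_R   # same chirality
mass_LR = omega_L.conj() @ beta @ omega_R   # opposite chirality

print("omega_R =", omega_R)
print("omega_L =", omega_L)
print("R-R mass bilinear:", mass_RR, " L-R mass bilinear:", mass_LR)
```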
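We now reformulate the above in field-theoretic terms. The Dirac Lagrangian
for a single massless fermion is

$$\hat{\mathcal L}_0 = \bar{\hat\psi}\, i\not\partial\, \hat\psi. \tag{12.151}$$

This is invariant not only under the now familiar global U(1) transformation
$\hat\psi \to \hat\psi' = e^{-i\alpha}\hat\psi$, but also under the ‘global chiral U(1)’
transformation

$$\hat\psi \to \hat\psi' = e^{-i\theta\gamma_5}\hat\psi \tag{12.152}$$

where $\theta$ is an arbitrary ($x$-independent) real parameter. The invariance is
easily verified: using $\{\gamma^0, \gamma_5\} = 0$ we have

$$\bar{\hat\psi}' = \hat\psi'^\dagger\gamma^0 = \hat\psi^\dagger e^{i\theta\gamma_5}\gamma^0 = \hat\psi^\dagger\gamma^0 e^{-i\theta\gamma_5} = \bar{\hat\psi}\, e^{-i\theta\gamma_5}, \tag{12.153}$$

and then using $\{\gamma^\mu, \gamma_5\} = 0$,

$$\bar{\hat\psi}'\gamma^\mu\partial_\mu\hat\psi' = \bar{\hat\psi}\, e^{-i\theta\gamma_5}\gamma^\mu\partial_\mu e^{-i\theta\gamma_5}\hat\psi = \bar{\hat\psi}\,\gamma^\mu e^{i\theta\gamma_5}\partial_\mu e^{-i\theta\gamma_5}\hat\psi = \bar{\hat\psi}\gamma^\mu\partial_\mu\hat\psi \tag{12.154}$$

as required. The corresponding Noether current is

$$\hat j_5^\mu = \bar{\hat\psi}\gamma^\mu\gamma_5\hat\psi, \tag{12.155}$$

and the spatial integral of its $\mu = 0$ component is the (conserved) chirality
operator

$$\hat Q_5 = \int \hat j_5^0\, d^3x = \int \hat\psi^\dagger\gamma_5\hat\psi\, d^3x = \int (\hat\phi^\dagger\hat\phi - \hat\chi^\dagger\hat\chi)\, d^3x. \tag{12.156}$$

We denote this chiral U(1) by U(1)$_5$.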
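
The matrix identities used in (12.153) and (12.154) can be checked numerically. The following sketch assumes NumPy and the representation (12.138); the value of `theta` is an arbitrary illustrative choice. Since $\gamma_5^2 = 1$, the chiral rotation is $e^{-i\theta\gamma_5} = \cos\theta - i\gamma_5\sin\theta$. The check confirms that the Dirac structure $\gamma^0\gamma^\mu$ of the kinetic term is left unchanged, whereas that of a mass term $\bar\psi\psi$ is rotated by $e^{-2i\theta\gamma_5}$, which is why the symmetry holds only for $m = 0$.

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

# gamma^0 = beta and gamma^i = beta alpha_i in the representation (12.138).
gamma0 = np.block([[Z2, I2], [I2, Z2]])
gammas = [gamma0] + [np.block([[Z2, -s], [s, Z2]]) for s in sigma]
gamma5 = np.diag([1, 1, -1, -1]).astype(complex)


def chiral_rotation(theta):
    """exp(-i theta gamma5) = cos(theta) - i sin(theta) gamma5."""
    return np.cos(theta) * np.eye(4) - 1j * np.sin(theta) * gamma5


theta = 0.4                      # illustrative rotation angle
U = chiral_rotation(theta)
Udag = U.conj().T

# psi-bar gamma^mu d_mu psi involves the matrix gamma^0 gamma^mu.
kinetic_invariant = all(
    np.allclose(Udag @ gamma0 @ g @ U, gamma0 @ g) for g in gammas
)
# A mass term psi-bar psi involves gamma^0 alone.
mass_invariant = np.allclose(Udag @ gamma0 @ U, gamma0)

print("kinetic term invariant:", kinetic_invariant)
print("mass term invariant:", mass_invariant)
```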

It is interesting to compare the form of $\hat Q_5$ with that of the corresponding
operator $\int \hat\psi^\dagger\hat\psi\, d^3x$ in the non-chiral case (cf (7.51)). The
difference has to do with their behaviour under a transformation already discussed in
section 4.2.1, namely parity. Under the parity transformation $\boldsymbol p \to -\boldsymbol p$
and thus, for (12.140) and (12.141) to be covariant under parity, we require
$\phi \to \chi$, $\chi \to \phi$; this will ensure (as we saw in section 4.2.1) that the Dirac
equation in the parity-transformed frame will be consistent with the one in the original
frame. In the representation (12.138), this is equivalent to saying that the spinor
$\omega_{\rm P}$ in the parity-transformed frame is given by

$$\omega_{\rm P} = \gamma^0\omega, \tag{12.157}$$

which implies $\phi_{\rm P} = \chi$, $\chi_{\rm P} = \phi$.

All this carries over to the field theory case, with
$\hat\psi_{\rm P}(\boldsymbol x, t) = \gamma^0\hat\psi(-\boldsymbol x, t)$,
as we saw in section 7.5.1. Consider then the operator $\hat Q_5$ in the
parity-transformed frame:

$$(\hat Q_5)_{\rm P} = \int \hat\psi_{\rm P}^\dagger(\boldsymbol x, t)\gamma_5\hat\psi_{\rm P}(\boldsymbol x, t)\, d^3x = \int \hat\psi^\dagger(-\boldsymbol x, t)\gamma^0\gamma_5\gamma^0\hat\psi(-\boldsymbol x, t)\, d^3x$$
$$= -\int \hat\psi^\dagger(\boldsymbol y, t)\gamma_5\hat\psi(\boldsymbol y, t)\, d^3y = -\hat Q_5, \tag{12.158}$$

where we used $\{\gamma^0, \gamma_5\} = 0$ and $(\gamma^0)^2 = 1$, and changed the integration
variable to $\boldsymbol y = -\boldsymbol x$. Hence $\hat Q_5$ is a ‘pseudoscalar’ operator,
meaning that it changes sign in the parity-transformed frame. We can also see this
directly from (12.156), making the interchange $\phi \leftrightarrow \chi$. In contrast, the
non-chiral operator $\int \hat\psi^\dagger\hat\psi\, d^3x$ is a (true) scalar, remaining the
same in the parity-transformed frame.

In a similar way, the appearance of the $\gamma_5$ in the current operator
$\hat j_5^\mu = \bar{\hat\psi}\gamma^\mu\gamma_5\hat\psi$ affects its parity properties: for
example, the $\mu = 0$ component $\hat\psi^\dagger\gamma_5\hat\psi$ is a pseudoscalar, as we
have seen. Problem 4.4(b) showed that the spatial parts
$\bar{\hat\psi}\boldsymbol\gamma\gamma_5\hat\psi$ behave as an axial vector rather than a normal
(polar) vector under parity: that is, they behave like $\boldsymbol r \times \boldsymbol p$ for
example, rather than like $\boldsymbol r$, in that they do not reverse sign under parity. Such a
current is referred to generally as an ‘axial vector current’, as opposed to the ordinary
vector currents with no $\gamma_5$.

As a consequence of (12.158), the operator $\hat Q_5$ changes the parity of any
state on which it acts. We can see this formally by introducing the (unitary)
parity operator $\hat P$ in field theory, such that states of definite parity
$|+\rangle$, $|-\rangle$ satisfy

$$\hat P|+\rangle = |+\rangle, \qquad \hat P|-\rangle = -|-\rangle. \tag{12.159}$$

Equation (12.158) then implies that $\hat P\hat Q_5\hat P^{-1} = -\hat Q_5$, following the
normal rule for operator transformations in quantum mechanics. Consider now the
state $\hat Q_5|+\rangle$. We have

$$\hat P\hat Q_5|+\rangle = (\hat P\hat Q_5\hat P^{-1})\hat P|+\rangle = -\hat Q_5|+\rangle \tag{12.160}$$

showing that $\hat Q_5|+\rangle$ is an eigenstate of $\hat P$ with the opposite eigenvalue, $-1$.

A very important physical consequence now follows from the fact that (in
this simple $m = 0$ model) $\hat Q_5$ is a symmetry operator commuting with the
Hamiltonian $\hat H$. We have

$$\hat H\hat Q_5|\psi\rangle = \hat Q_5\hat H|\psi\rangle = E\hat Q_5|\psi\rangle. \tag{12.161}$$

Hence for every state $|\psi\rangle$ with energy eigenvalue $E$, there should exist a state
$\hat Q_5|\psi\rangle$ with the same eigenvalue $E$ and the opposite parity: that is, chiral
symmetry apparently implies the existence of ‘parity doublets’.

Of course, it may reasonably be objected that all of the above refers not 
only to the massless, but also the non-interacting case. However, this is just 
where the analysis begins to get interesting. Suppose we allow the fermion
field $\hat\psi$ to interact with a U(1)-gauge field $\hat A^\mu$ via the standard
electromagnetic coupling

$$\hat{\mathcal L}_{\rm int} = q\,\bar{\hat\psi}\gamma^\mu\hat\psi\,\hat A_\mu. \tag{12.162}$$

Remarkably enough, $\hat{\mathcal L}_{\rm int}$ is also invariant under the chiral transformation
(12.152), for the simple reason that the ‘Dirac’ structure of (12.162) is exactly
the same as that of the free kinetic term $\bar{\hat\psi}\not\partial\hat\psi$: the ‘covariant
derivative’ prescription $\partial^\mu \to \hat D^\mu = \partial^\mu + iq\hat A^\mu$ automatically
means that any ‘Dirac’ (e.g. $\gamma_5$) symmetry of the kinetic part will be preserved when
the gauge interaction is included. Thus chirality remains a ‘good symmetry’ in the presence
of a U(1) gauge interaction.

The generalization of this to the more physical $m_u \approx m_d \approx 0$ case is quite
straightforward. The Lagrangian (12.87) becomes

$$\hat{\mathcal L} = \bar{\hat q}\, i\not\partial\, \hat q \tag{12.163}$$

as $m \to 0$, which is invariant under the $\gamma_5$-version of (12.89), namely

$$\hat q' = \exp(-i\boldsymbol\beta\cdot\boldsymbol\tau/2\,\gamma_5)\,\hat q. \tag{12.164}$$

There are three associated Noether currents (compare (12.109))

$$\hat{\boldsymbol T}_5^{(1/2)\mu} = \bar{\hat q}\gamma^\mu\gamma_5\frac{\boldsymbol\tau}{2}\hat q \tag{12.165}$$

which are axial vectors, and three associated ‘charge’ operators

$$\hat{\boldsymbol T}_5^{(1/2)} = \int \hat q^\dagger\gamma_5\frac{\boldsymbol\tau}{2}\hat q\, d^3x \tag{12.166}$$

which are pseudoscalars, belonging to the $T = 1$ representation of SU(2). We
have a new non-Abelian global symmetry, called chiral SU(2)$_{\rm f}$, which we shall
denote by SU(2)$_{\rm f5}$. As far as their action in the isospinor u-d space is concerned,
these chiral charges have exactly the same effect as the ordinary flavour
isospin operators of (12.109). But they are pseudoscalars rather than scalars,
and hence they flip the parity of a state on which they act. Thus, whereas
the isospin raising operator $\hat T_+^{(1/2)}$ is such that

$$\hat T_+^{(1/2)}|d\rangle = |u\rangle, \tag{12.167}$$

$\hat T_{+5}^{(1/2)}$ will also produce a u-type state from a d-type one via

$$\hat T_{+5}^{(1/2)}|d\rangle = |\tilde u\rangle, \tag{12.168}$$

but the $|\tilde u\rangle$ state will have opposite parity from $|u\rangle$. Further, since
$[\hat T_{+5}^{(1/2)}, \hat H] = 0$, this state $|\tilde u\rangle$ will be degenerate with
$|d\rangle$. Similarly, the state $|\tilde d\rangle$ produced via
$\hat T_{-5}^{(1/2)}|u\rangle$ will have opposite parity from $|d\rangle$, and will be
degenerate with $|u\rangle$. The upshot is that we have two massless states $|u\rangle$,
$|d\rangle$ of (say) positive parity, and a further two massless states $|\tilde u\rangle$,
$|\tilde d\rangle$ of negative parity, in this simple model.

Footnote 8: $\hat{\mathcal L}$ is also invariant under $\hat q' = e^{-i\beta\gamma_5}\hat q$,
which is an ‘axial’ version of the global U(1) associated with quark number conservation.
We shall discuss this additional U(1)-symmetry in section 18.1.1.

Suppose we now let the quarks interact, for example by an interaction of
the QCD type, already indicated in (12.136). In that case, the interaction
terms have the form

$$\bar{\hat u}\gamma^\mu\frac{\lambda_a}{2}\hat u\,\hat A^a_\mu + \bar{\hat d}\gamma^\mu\frac{\lambda_a}{2}\hat d\,\hat A^a_\mu \tag{12.169}$$

where

$$\hat u = \begin{pmatrix}\hat u_r\\ \hat u_b\\ \hat u_g\end{pmatrix}, \qquad \hat d = \begin{pmatrix}\hat d_r\\ \hat d_b\\ \hat d_g\end{pmatrix}, \tag{12.170}$$

and the 3 x 3 $\lambda$'s act in the r-b-g space. Just as in the previous U(1) case,
the interaction (12.169) is invariant under the global SU(2)$_{\rm f5}$ chiral symmetry
(12.164), acting in the u-d space. Note that, somewhat confusingly, (12.169) is
not a simple ‘gauging’ of (12.163): a covariant derivative is being introduced,
but in the space of a new (colour) degree of freedom, not in flavour space. In
fact, the flavour degrees of freedom are ‘inert’ in (12.169), so that it is invariant
under SU(2)$_{\rm f}$ transformations, while the Dirac structure implies that it is also
invariant under chiral SU(2)$_{\rm f5}$ transformations (12.164). All the foregoing can
be extended unchanged to chiral SU(3)$_{\rm f5}$, given that QCD is ‘flavour blind’,
and supposing that $m_s \approx 0$.

The effect of the QCD interactions must be to bind the quarks into nucleons,
such as the proton (uud) and neutron (udd). But what about the equally
possible states ($\tilde u$ud) and ($\tilde u$dd), for example? These would have to be
degenerate in mass with (uud) and (udd), and of opposite parity. Yet such ‘parity
doublet’ partners of the physical p and n are not observed, and so we have a
puzzle.

One might feel that this whole discussion is unrealistic, based as it is on 
massless quarks. Are the baryons then supposed to be massless too? If so, 
perhaps the discussion is idle, as they are evidently by no means massless. But 
it is not necessary to suppose that the mass of a relativistic bound state has 
any very simple relation to the masses of its constituents: its mass may derive, 
in part at least, from the interaction energy in the fields. Alternatively, one 
might suppose that somehow the finite mass of the u and d quarks, which of 
course breaks the chiral symmetry, splits the degeneracy of the nucleon parity 
doublets, promoting the negative parity ‘nucleon’ state to an acceptably high 
mass. But this seems very implausible, in view of the actual magnitudes of
$m_u$ and $m_d$, compared to the nucleon masses.

In short, we have here a situation in which a symmetry of the Lagrangian 
(to an apparently good approximation) does not seem to result in the expected 
multiplet structure of the states. The resolution of this puzzle will have to 
await our discussion of ‘spontaneous symmetry breaking’, in Part VII. 

In conclusion, we note an important feature of the flavour symmetry currents
$\hat{\boldsymbol T}^{(1/2)\mu}$ and $\hat{\boldsymbol T}_5^{(1/2)\mu}$ discussed in this and
the preceding section. Although these currents have been introduced entirely within the
context of strong interaction symmetries, it is a remarkable fact that exactly these
currents also appear in strangeness-conserving semileptonic weak interactions such as
$\beta$-decay, as we shall see in chapter 20. (The fact that both appear is precisely
a manifestation of parity violation in weak interactions, as we noted in section
4.2.1.) Thus some of the physical consequences of ‘spontaneously broken
chiral symmetry’ will involve weak interaction quantities.


Problems 


12.1 Verify that the set of all unitary 2 x 2 matrices with determinant equal 
to +1 form a group, the law of combination being matrix multiplication. 


12.2 Derive (12.18). 
12.3 Check the commutation relations (12.28). 


12.4 Show that the T;'s defined by (12.45) satisfy (12.47). 


12.5 Write out each of the 3 x 3 matrices $T_i^{(1)}$ ($i = 1,2,3$) whose matrix
elements are given by (12.48), and verify that they satisfy the SU(2) commutation
relations (12.47).


12.6 Verify (12.62). 


12.7 Show that a general Hermitian traceless 3 x 3 matrix is parametrized 
by 8 real numbers. 


12.8 Check that (12.84) is consistent with (12.80) and the infinitesimal form
of (12.81), and verify that the matrices $G^{(8)}$ defined by (12.84) satisfy the
commutation relations (12.83).


12.9 Verify, by comparing the coefficients of $\epsilon_1$, $\epsilon_2$ and $\epsilon_3$ on both
sides of (12.99), that (12.100) follows from (12.99).


12.10 Verify that the operators $\hat{\boldsymbol T}^{(1/2)}$ defined by (12.101) satisfy (12.100).
(Note: use the anticommutation relations of the fermionic operators.)


12.11 Verify that the operators $\hat{\boldsymbol T}^{(1/2)}$ given by (12.101) satisfy the
commutation relations (12.103).


... The difference between a neutron and a proton is then a purely
arbitrary process. As usually conceived, however, this arbitrariness is 
subject to the following limitations: once one chooses what to call a 
proton, what a neutron, at one space time point, one is then not free 
to make any choices at other space time points. 

It seems that this is not consistent with the localized field concept 
that underlies the usual physical theories. In the present paper we wish 
to explore the possibility of requiring all interactions to be invariant 
under independent rotations of the isotopic spin at all space time points 


—Yang and Mills (1954) 


Consider the global SU(2) isospinor transformation (12.32), written here again,

$$\psi^{(1/2)\prime}(x) = \exp(i\boldsymbol\alpha\cdot\boldsymbol\tau/2)\,\psi^{(1/2)}(x) \tag{13.1}$$

for an isospin doublet wavefunction $\psi^{(1/2)}(x)$. The dependence of $\psi^{(1/2)}(x)$ on
the space-time coordinate $x$ has now been included explicitly, but the parameters
$\boldsymbol\alpha$ are independent of $x$, which is why the transformation is called a ‘global’
one. As we have seen in the previous chapter, invariance under this transformation
amounts to the assertion that the choice of which two base states
((n, p), (u, d), ...) to use is a matter of convention; any such non-Abelian
phase transformation on a chosen pair produces another equally good pair.
However, the choice cannot be made independently at all space-time points,
only globally. To Yang and Mills (1954) (cf the quotation above) this seemed
somehow an unaesthetic limitation of symmetry: ‘Once one chooses what to
call a proton, what a neutron, at one space-time point, one is then not free
to make any choices at other space-time points.’ They even suggested that
this could be viewed as ‘inconsistent with the localised field concept’, and
they therefore ‘explored the possibility’ of replacing this global (space-time
independent) phase transformation by the local (space-time dependent) one

$$\psi^{(1/2)\prime}(x) = \exp[ig\boldsymbol\tau\cdot\boldsymbol\alpha(x)/2]\,\psi^{(1/2)}(x) \tag{13.2}$$

in which the phase parameters $\boldsymbol\alpha(x)$ are also now functions of
$x = (t, \boldsymbol x)$ as indicated. Notice that we have inserted a parameter $g$ in the
exponent to make the analogy with the electromagnetic U(1) case

$$\psi'(x) = \exp[iq\chi(x)]\,\psi(x) \tag{13.3}$$

even stronger: $g$ will be a coupling strength, analogous to the electromagnetic
charge $q$. The consideration of theories based on (13.2) was the fundamental
step taken by Yang and Mills (1954); see also Shaw (1955).

Global symmetries and their associated (possibly approximate) conserva- 
tion laws are certainly important, but they do not have the dynamical signif- 
icance of local symmetries. We saw in section 7.4 how the ‘requirement’ of 
local U(1) phase invariance led almost automatically to the local gauge theory
of QED, in which the conserved current $\bar{\hat\psi}\gamma^\mu\hat\psi$ of the global U(1)
symmetry is ‘promoted’ to the role of dynamical current which, when dotted into the gauge
field $\hat A^\mu$, gave the interaction term in $\hat{\mathcal L}_{\rm QED}$. A similar link
between symmetry and dynamics appears if, following Yang and Mills, we generalize the
non-Abelian global symmetries of the preceding chapter to local non-Abelian 
symmetries, which are the subject of the present one. 

However, as mentioned in the introduction to chapter 12, the original 
Yang-Mills attempt to get a theory of hadronic interactions by ‘localizing’ 
the flavour symmetry group SU(2) turned out not to be phenomenologically 
viable (although a remarkable attempt was made to push the idea further 
by Sakurai (1960)). In the event, the successful application of a local SU(2) 
symmetry was to the weak interactions. But this is complicated by the fact 
that the symmetry is ‘spontaneously broken’, and consequently we shall delay 
the discussion of this application until after QCD — which is the theory of 
strong interactions, but at the quark, rather than the composite (hadronic) 
level. QCD is based on the local form of an SU(3) symmetry; once again, 
however, it is not the flavour SU(3) of section 12.2, but a symmetry with 
respect to a totally new degree of freedom, colour. This will be introduced in 
the following chapter. 

Although the application of local SU(2) symmetry to the weak interactions 
will follow that of local SU(3) to the strong, we shall begin our discussion 
of local non-Abelian symmetries with the local SU(2) case, since the group 
theory is more familiar. We shall also start with the ‘wavefunction’ formalism, 
deferring the field theory treatment until section 13.3. 


13.1 Local SU(2) symmetry 
13.1.1 The covariant derivative and interactions with matter 


In this section we shall introduce the main ideas of the non-Abelian SU(2)
gauge theory which results from the demand of invariance, or covariance,
under transformations such as (13.2). We shall generally use the language of
isospin when referring to the physical states and operators, bearing in mind
that this will eventually mean weak isospin.

We shall mimic as literally as possible the discussion of electromagnetic
gauge covariance in sections 2.4 and 2.5 of volume 1. As in that case, no
free particle wave equation can be covariant under the transformation (13.2)
(taking the isospinor example for definiteness), since the gradient terms in the
equation will act on the phase factor $\boldsymbol\alpha(x)$. However, wave equations with a
suitably defined covariant derivative can be covariant under (13.2); physically
this means that, just as for electromagnetism, covariance under local non-Abelian
phase transformations requires the introduction of a definite force
field.

In the electromagnetic case the covariant derivative is

$$D^\mu = \partial^\mu + iqA^\mu(x). \tag{13.4}$$

For convenience we recall here the crucial property of $D^\mu$. Under a local U(1)
phase transformation, a wavefunction transforms as (cf (13.3))

$$\psi(x) \to \psi'(x) = \exp(iq\chi(x))\,\psi(x), \tag{13.5}$$

from which it easily follows that the derivative (gradient) of $\psi$ transforms as

$$\partial^\mu\psi(x) \to \partial^\mu\psi'(x) = \exp(iq\chi(x))\,\partial^\mu\psi(x) + iq\,\partial^\mu\chi(x)\exp(iq\chi(x))\,\psi(x). \tag{13.6}$$

Comparing (13.6) with (13.5), we see that, in addition to the expected first
term on the right-hand side of (13.6), which has the same form as the right-hand
side of (13.5), there is an extra term in (13.6). By contrast, the covariant
derivative of $\psi$ transforms as (see section 2.4 of volume 1)

$$D^\mu\psi(x) \to D'^\mu\psi'(x) = \exp(iq\chi(x))\,D^\mu\psi(x) \tag{13.7}$$

exactly as in (13.5), with no additional term on the right-hand side. Note
that $D^\mu$ has to carry a prime also, since it contains $A^\mu$ which transforms to
$A'^\mu = A^\mu - \partial^\mu\chi(x)$ when $\psi$ transforms by (13.5). The property (13.7)
ensured the gauge covariance of wave equations in the U(1) case; the similar property
in the quantum field case meant that a globally U(1)-invariant Lagrangian
could be converted immediately to a locally U(1)-invariant one by replacing
$\partial^\mu$ by $\hat D^\mu$ (section 7.4).
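
The contrast between (13.6) and (13.7) can be seen numerically. The sketch below is an illustration under stated assumptions rather than anything taken from the text: it works in a single coordinate `x`, picks arbitrary smooth profiles for $\psi$, $A$ and $\chi$, and uses second-order finite differences from NumPy (`np.gradient`), so agreement is expected only up to discretization error.

```python
import numpy as np

q = 0.7                                  # illustrative charge
x = np.linspace(-2.0, 2.0, 4001)         # one coordinate, spacing 1e-3

# Arbitrary smooth test profiles (assumptions made for this example only).
psi = np.exp(-x**2) * (1 + 0.5j * x)     # wavefunction
A = np.sin(x)                            # gauge field component
chi = np.sin(1.5 * x)                    # gauge function
dchi = 1.5 * np.cos(1.5 * x)             # its exact derivative


def ddx(f):
    """Finite-difference derivative along x."""
    return np.gradient(f, x)


def D(f, A_field):
    """Covariant derivative (13.4) in one dimension."""
    return ddx(f) + 1j * q * A_field * f


phase = np.exp(1j * q * chi)
psi_p = phase * psi                      # eq. (13.5)
A_p = A - dchi                           # A' = A - d(chi)

interior = slice(2, -2)                  # skip the less accurate edge stencils
plain_mismatch = np.max(np.abs(ddx(psi_p) - phase * ddx(psi))[interior])
cov_mismatch = np.max(np.abs(D(psi_p, A_p) - phase * D(psi, A))[interior])

print("ordinary derivative mismatch :", plain_mismatch)
print("covariant derivative mismatch:", cov_mismatch)
```

The first mismatch is of order $q\,\partial\chi\,\psi$, the extra term in (13.6); the second is limited only by the finite-difference error.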

In appendix D of volume 1 we introduced the idea of ‘covariance’ in the 
context of coordinate transformations of 3- and 4-vectors. The essential notion 
was of something ‘maintaining the same form’, or ‘transforming the same 
way’. The transformations being considered here are gauge transformations 
rather than coordinate ones; nevertheless it is true that, under them, $D^\mu\psi$
transforms in the same way as $\psi$, while $\partial^\mu\psi$ does not. Thus the term covariant
derivative seems appropriate. In fact, there is a much closer analogy between
the ‘coordinate’ and the ‘gauge’ cases, which we did not present in volume 1,
but give now in appendix N, for the interested reader.

We need the local SU(2) generalization of (13.4), appropriate to the local
SU(2) transformation (13.2). Just as in the U(1) case (13.6), the ordinary
gradient acting on $\psi^{(1/2)}(x)$ does not transform in the same way as $\psi^{(1/2)}(x)$:
taking $\partial^\mu$ of (13.2) leads to


$$\partial^\mu\psi^{(1/2)\prime}(x) = \exp[ig\boldsymbol\tau\cdot\boldsymbol\alpha(x)/2]\,\partial^\mu\psi^{(1/2)}(x) + ig\boldsymbol\tau\cdot\partial^\mu\boldsymbol\alpha(x)/2\,\exp[ig\boldsymbol\tau\cdot\boldsymbol\alpha(x)/2]\,\psi^{(1/2)}(x) \tag{13.8}$$

as can be checked by writing the matrix exponential $\exp[A]$ as the series

$$\exp[A] = \sum_{n=0}^{\infty} A^n/n!$$

and differentiating term by term. By analogy with (13.7), the key property
we demand for our SU(2) covariant derivative $D^\mu\psi^{(1/2)}$ is that this quantity
should transform like $\psi^{(1/2)}$, i.e. without the second term in (13.8). So we
require

$$(D'^\mu\psi^{(1/2)\prime}(x)) = \exp[ig\boldsymbol\tau\cdot\boldsymbol\alpha(x)/2]\,(D^\mu\psi^{(1/2)}(x)). \tag{13.9}$$

The definition of $D^\mu$ which generalizes (13.4) so as to fulfil this requirement
is

$$D^\mu\ (\text{acting on an isospinor}) = \partial^\mu + ig\boldsymbol\tau\cdot\boldsymbol W^\mu(x)/2. \tag{13.10}$$

The definition (13.10), as indicated on the left-hand side, is only appropriate
for isospinors $\psi^{(1/2)}$: it has to be suitably generalized for other $\psi$'s (see
(13.44)).
We now discuss (13.9) and (13.10) in detail. The $\partial^\mu$ is multiplied implicitly
by the unit 2 x 2 matrix, and the $\tau$'s act on the two-component space of $\psi^{(1/2)}$.
The $\boldsymbol W^\mu(x)$ are three independent gauge fields

$$\boldsymbol W^\mu = (W_1^\mu, W_2^\mu, W_3^\mu), \tag{13.11}$$

generalizing the single electromagnetic gauge field $A^\mu$. They are called SU(2)
gauge fields, or more generally Yang-Mills fields. The term $\boldsymbol\tau\cdot\boldsymbol W^\mu$ is then
the 2 x 2 matrix

$$\boldsymbol\tau\cdot\boldsymbol W^\mu = \begin{pmatrix} W_3^\mu & W_1^\mu - iW_2^\mu\\ W_1^\mu + iW_2^\mu & -W_3^\mu \end{pmatrix} \tag{13.12}$$

using the $\tau$'s of (12.25); the $x$-dependence of the $W$'s is understood. Let
us ‘decode’ the desired property (13.9), for the algebraically simpler case of
an infinitesimal local SU(2) transformation with parameters $\boldsymbol\epsilon(x)$, which are
of course functions of $x$ since the transformation is local. In this case, $\psi^{(1/2)}$
transforms by

$$\psi^{(1/2)\prime} = (1 + ig\boldsymbol\tau\cdot\boldsymbol\epsilon(x)/2)\,\psi^{(1/2)} \tag{13.13}$$

and the ‘uncovariant’ derivative $\partial^\mu\psi^{(1/2)}$ transforms by

$$\partial^\mu\psi^{(1/2)\prime} = (1 + ig\boldsymbol\tau\cdot\boldsymbol\epsilon(x)/2)\,\partial^\mu\psi^{(1/2)} + ig\boldsymbol\tau\cdot\partial^\mu\boldsymbol\epsilon(x)/2\,\psi^{(1/2)}, \tag{13.14}$$

where we have retained only the terms linear in $\boldsymbol\epsilon$ from an expansion of (13.8)
with $\boldsymbol\alpha \to \boldsymbol\epsilon$. We have now dropped the $x$-dependence of the
$\psi^{(1/2)}$'s, but kept that of $\boldsymbol\epsilon(x)$, and we have used the simple ‘1’ for the unit
matrix in the two-dimensional isospace. Equation (13.14) exhibits again an ‘extra piece’ on
the right-hand side, as compared to (13.13). On the other hand, inserting (13.10)
and (13.13) into our covariant derivative requirement (13.9) yields, for the
left-hand side in the infinitesimal case,

$$D'^\mu\psi^{(1/2)\prime} = (\partial^\mu + ig\boldsymbol\tau\cdot\boldsymbol W'^\mu/2)[1 + ig\boldsymbol\tau\cdot\boldsymbol\epsilon(x)/2]\,\psi^{(1/2)} \tag{13.15}$$

while the right-hand side is

$$[1 + ig\boldsymbol\tau\cdot\boldsymbol\epsilon(x)/2](\partial^\mu + ig\boldsymbol\tau\cdot\boldsymbol W^\mu/2)\,\psi^{(1/2)}. \tag{13.16}$$

In order to verify that these are the same, however, we would need to know
$\boldsymbol W'^\mu$, that is, the transformation law for the three $\boldsymbol W^\mu$ fields. Instead, we
shall proceed ‘in reverse’, and use the imposed equality between (13.15) and
(13.16) to determine the transformation law of $\boldsymbol W^\mu$.

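Suppose that, under this infinitesimal transformation,

$$\boldsymbol W^\mu \to \boldsymbol W'^\mu = \boldsymbol W^\mu + \delta\boldsymbol W^\mu. \tag{13.17}$$

Then the condition of equality is

$$[\partial^\mu + ig\boldsymbol\tau/2\cdot(\boldsymbol W^\mu + \delta\boldsymbol W^\mu)][1 + ig\boldsymbol\tau\cdot\boldsymbol\epsilon(x)/2]\,\psi^{(1/2)} = [1 + ig\boldsymbol\tau\cdot\boldsymbol\epsilon(x)/2](\partial^\mu + ig\boldsymbol\tau\cdot\boldsymbol W^\mu/2)\,\psi^{(1/2)}. \tag{13.18}$$

Multiplying out the terms, neglecting the term of second order involving the
product of $\delta\boldsymbol W^\mu$ and $\boldsymbol\epsilon$ and noting that

$$\partial^\mu(\boldsymbol\epsilon\psi) = (\partial^\mu\boldsymbol\epsilon)\psi + \boldsymbol\epsilon(\partial^\mu\psi) \tag{13.19}$$

we see that many terms cancel and we are left with

$$ig\frac{\boldsymbol\tau\cdot\delta\boldsymbol W^\mu}{2}\psi^{(1/2)} = -ig\frac{\boldsymbol\tau\cdot\partial^\mu\boldsymbol\epsilon(x)}{2}\psi^{(1/2)} + (ig)^2\left[\left(\frac{\boldsymbol\tau\cdot\boldsymbol\epsilon(x)}{2}\right)\left(\frac{\boldsymbol\tau\cdot\boldsymbol W^\mu}{2}\right) - \left(\frac{\boldsymbol\tau\cdot\boldsymbol W^\mu}{2}\right)\left(\frac{\boldsymbol\tau\cdot\boldsymbol\epsilon(x)}{2}\right)\right]\psi^{(1/2)}. \tag{13.20}$$

Using the identity for Pauli matrices (see problem 3.4(b))

$$\boldsymbol\sigma\cdot\boldsymbol a\ \boldsymbol\sigma\cdot\boldsymbol b = \boldsymbol a\cdot\boldsymbol b + i\boldsymbol\sigma\cdot\boldsymbol a\times\boldsymbol b \tag{13.21}$$

this yields

$$\boldsymbol\tau\cdot\delta\boldsymbol W^\mu = -\boldsymbol\tau\cdot\partial^\mu\boldsymbol\epsilon(x) - g\boldsymbol\tau\cdot(\boldsymbol\epsilon(x)\times\boldsymbol W^\mu). \tag{13.22}$$

Equating components of $\boldsymbol\tau$ on both sides, we deduce

$$\delta\boldsymbol W^\mu = -\partial^\mu\boldsymbol\epsilon(x) - g[\boldsymbol\epsilon(x)\times\boldsymbol W^\mu]. \tag{13.23}$$

The reader may note the close similarity between these manipulations and
those encountered in section 12.1.3.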
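
The algebra leading from the commutator of Pauli matrices to the cross product in the gauge field variation can be checked numerically at a single space-time point. The sketch below assumes NumPy; the vectors `eps`, `d_eps` (standing for $\partial^\mu\boldsymbol\epsilon$ at fixed $\mu$) and `W` are random illustrative numbers and `g` is an arbitrary coupling. It verifies the Pauli identity for $\boldsymbol\tau$, then solves the first-order matrix condition for $\delta\boldsymbol W^\mu$ by taking traces with $\tau_i$ and compares with $-\partial^\mu\boldsymbol\epsilon - g\,\boldsymbol\epsilon\times\boldsymbol W^\mu$.

```python
import numpy as np

tau = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)


def dot(v):
    """Return the 2x2 matrix tau . v for a real 3-vector v."""
    return np.einsum('a,aij->ij', v, tau)


rng = np.random.default_rng(0)
eps, d_eps, W = rng.normal(size=(3, 3))   # three illustrative isovectors
g = 0.65                                  # illustrative coupling

# Pauli identity: (tau.a)(tau.b) = a.b + i tau.(a x b).
identity_ok = np.allclose(
    dot(eps) @ dot(W),
    np.dot(eps, W) * np.eye(2) + 1j * dot(np.cross(eps, W)),
)

# First-order condition: i g (tau.dW)/2 = rhs, with
# rhs = -i g (tau.d_eps)/2 + (i g)^2 [tau.eps/2, tau.W/2].
rhs = (-1j * g * dot(d_eps) / 2
       + (1j * g) ** 2 * (dot(eps) @ dot(W) - dot(W) @ dot(eps)) / 4)
tau_dot_dW = rhs / (1j * g / 2)

# Components follow from Tr(tau_i tau_j) = 2 delta_ij.
dW = np.array([0.5 * np.trace(t @ tau_dot_dW) for t in tau])
expected = -d_eps - g * np.cross(eps, W)

print("Pauli identity holds:", identity_ok)
print("delta W          =", dW.real)
print("-d eps - g eps x W =", expected)
```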

Equation (13.23) defines the way in which the SU(2) gauge fields $\boldsymbol W^\mu$
transform under an infinitesimal SU(2) gauge transformation. If it were not
for the presence of the first term $\partial^\mu\boldsymbol\epsilon(x)$ on the right-hand side, (13.23)
would be simply the (infinitesimal) transformation law for the $T = 1$ triplet
representation of SU(2) (see (12.64) and (12.65) in section 12.1.3). As mentioned at
the end of section 12.2, the $T = 1$ representation is the ‘adjoint’, or ‘regular’,
representation of SU(2), and this is the one to which gauge fields belong, in
general. But there is the extra term $-\partial^\mu\boldsymbol\epsilon(x)$. Clearly this is directly
analogous to the $-\partial^\mu\chi(x)$ term in the transformation of the U(1) gauge field $A^\mu$;
here, an independent infinitesimal function $\epsilon_i(x)$ is required for each component
$W_i^\mu(x)$. If the $\epsilon$'s were independent of $x$, then $\partial^\mu\boldsymbol\epsilon(x)$ would of
course vanish and the transformation law (13.23) would indeed be just that of an
SU(2) triplet. Thus we can say that under global SU(2) transformations, the
$\boldsymbol W^\mu$ behave as a normal triplet. But under local SU(2) transformations they
acquire the additional $-\partial^\mu\boldsymbol\epsilon(x)$ piece, and thus no longer transform
‘properly’, as an SU(2) triplet. In exactly the same way, $\partial^\mu\psi^{(1/2)}$ did not
transform ‘properly’ as an SU(2) doublet, under a local SU(2) transformation, because
of the second term in (13.14), which also involves $\partial^\mu\boldsymbol\epsilon(x)$. The remarkable
result behind the fact that $D^\mu\psi^{(1/2)}$ does transform ‘properly’ under local SU(2)
transformations, is that the extra term in (13.23) precisely cancels that in
(13.14)!

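To summarize progress so far: we have shown that, for infinitesimal transformations,
the relation

$$(D'^\mu\psi^{(1/2)\prime}) = [1 + ig\boldsymbol\tau\cdot\boldsymbol\epsilon(x)/2]\,(D^\mu\psi^{(1/2)}) \tag{13.24}$$

(where $D^\mu$ is given by (13.10)) holds true if in addition to the infinitesimal
local SU(2) phase transformation on $\psi^{(1/2)}$

$$\psi^{(1/2)} \to \psi^{(1/2)\prime} = [1 + ig\boldsymbol\tau\cdot\boldsymbol\epsilon(x)/2]\,\psi^{(1/2)} \tag{13.25}$$

the gauge fields transform according to

$$\boldsymbol W^\mu \to \boldsymbol W'^\mu = \boldsymbol W^\mu - \partial^\mu\boldsymbol\epsilon(x) - g[\boldsymbol\epsilon(x)\times\boldsymbol W^\mu]. \tag{13.26}$$

In obtaining these results, the form (13.10) for the covariant derivative has
been assumed, and only the infinitesimal version of (13.2) has been treated
explicitly. It turns out that (13.10) is still appropriate for the finite
(non-infinitesimal) transformation (13.2), but the associated transformation law
for the gauge fields is then slightly more complicated than (13.26). Let us
write

$$U(\boldsymbol\alpha(x)) = \exp[ig\boldsymbol\tau\cdot\boldsymbol\alpha(x)/2] \tag{13.27}$$

so that $\psi^{(1/2)}$ transforms by

$$\psi^{(1/2)\prime} = U(\boldsymbol\alpha(x))\,\psi^{(1/2)}. \tag{13.28}$$

Then we require

$$D'^\mu\psi^{(1/2)\prime} = U(\boldsymbol\alpha(x))\,D^\mu\psi^{(1/2)}. \tag{13.29}$$

The left-hand side is

$$(\partial^\mu + ig\boldsymbol\tau\cdot\boldsymbol W'^\mu/2)\,U(\boldsymbol\alpha(x))\psi^{(1/2)} = (\partial^\mu U)\psi^{(1/2)} + U\partial^\mu\psi^{(1/2)} + ig\boldsymbol\tau\cdot\boldsymbol W'^\mu/2\,U\psi^{(1/2)}, \tag{13.30}$$

while the right-hand side is

$$U(\partial^\mu + ig\boldsymbol\tau\cdot\boldsymbol W^\mu/2)\,\psi^{(1/2)}. \tag{13.31}$$

The $U\partial^\mu\psi^{(1/2)}$ terms cancel leaving

$$(\partial^\mu U)\psi^{(1/2)} + ig\boldsymbol\tau\cdot\boldsymbol W'^\mu/2\,U\psi^{(1/2)} = U\,ig\boldsymbol\tau\cdot\boldsymbol W^\mu/2\,\psi^{(1/2)}. \tag{13.32}$$

Since this has to be true for all (two-component) $\psi^{(1/2)}$'s, we can treat it as an
operator equation acting in the space of $\psi^{(1/2)}$'s to give

$$\partial^\mu U + ig\boldsymbol\tau\cdot\boldsymbol W'^\mu/2\,U = U\,ig\boldsymbol\tau\cdot\boldsymbol W^\mu/2, \tag{13.33}$$

or equivalently

$$\frac{1}{2}\boldsymbol\tau\cdot\boldsymbol W'^\mu = \frac{i}{g}(\partial^\mu U)U^{-1} + U\frac{1}{2}\boldsymbol\tau\cdot\boldsymbol W^\mu U^{-1}, \tag{13.34}$$

which defines the (finite) transformation law for SU(2) gauge fields. Problem
13.1 verifies that (13.34) reduces to (13.26) in the infinitesimal case
$\boldsymbol\alpha(x) \to \boldsymbol\epsilon(x)$.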
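
The finite law (13.34) can be tested numerically against the covariance requirement (13.29). The sketch below is illustrative only: it assumes NumPy, restricts to one coordinate `x`, chooses arbitrary smooth profiles for $\boldsymbol\alpha(x)$, $\boldsymbol W(x)$ and $\psi^{(1/2)}(x)$, uses the closed form $\exp(i\boldsymbol\tau\cdot\boldsymbol a) = \cos|\boldsymbol a| + i\,\hat{\boldsymbol a}\cdot\boldsymbol\tau\,\sin|\boldsymbol a|$ for $U$, and takes derivatives by finite differences, so the agreement holds up to discretization error.

```python
import numpy as np

g = 0.65                                   # illustrative coupling
x = np.linspace(-1.0, 1.0, 2001)           # one coordinate, spacing 1e-3
tau = np.array([
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
], dtype=complex)


def tau_dot(v):
    """Map an (N, 3) array of isovectors to an (N, 2, 2) array of tau . v."""
    return np.einsum('na,aij->nij', v, tau)


def act(M, psi):
    """Apply an (N, 2, 2) matrix field to an (N, 2) doublet, point by point."""
    return np.einsum('nij,nj->ni', M, psi)


def ddx(f):
    """Finite-difference derivative along x (first array axis)."""
    return np.gradient(f, x, axis=0)


# Arbitrary smooth profiles (assumptions made for this example only).
alpha = np.stack([0.8 * np.sin(x), 0.5 * x, 0.3 * np.cos(2 * x)], axis=1)
W = np.stack([np.cos(x), x**2, 0.2 * np.ones_like(x)], axis=1)
psi = np.stack([np.exp(-x**2), (1 + 1j * x) * np.exp(-x**2 / 2)], axis=1)

# U = exp(i g tau.alpha / 2), eq. (13.27), in closed form.
norm = np.linalg.norm(alpha, axis=1)
half = 0.5 * g * norm
U = (np.cos(half)[:, None, None] * np.eye(2)
     + 1j * (np.sin(half) / norm)[:, None, None] * tau_dot(alpha))
Udag = np.conj(np.swapaxes(U, 1, 2))

M = 0.5 * tau_dot(W)                                   # tau.W / 2
M_p = (1j / g) * (ddx(U) @ Udag) + U @ M @ Udag        # eq. (13.34)


def D(psi_field, M_field):
    """Covariant derivative (13.10) with M_field = tau.W / 2."""
    return ddx(psi_field) + 1j * g * act(M_field, psi_field)


psi_p = act(U, psi)                                    # eq. (13.28)
inner = slice(2, -2)                                   # skip edge stencils
cov_mismatch = np.max(np.abs(D(psi_p, M_p) - act(U, D(psi, M)))[inner])
plain_mismatch = np.max(np.abs(ddx(psi_p) - act(U, ddx(psi)))[inner])

# Transformed field components: W'_a = Tr(tau_a M').
W_p = np.einsum('aij,nji->na', tau, M_p)

print("covariant derivative mismatch:", cov_mismatch)
print("ordinary derivative mismatch :", plain_mismatch)
```

That `W_p` comes out real (to within the discretization error) reflects the fact that the right-hand side of (13.34) is Hermitian and traceless, so the transformed fields again form a real triplet.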


Suppose now that we consider a Dirac equation for $\psi^{(1/2)}$:

$$(i\gamma_\mu\partial^\mu - m)\,\psi^{(1/2)} = 0 \tag{13.35}$$

where both the ‘isospinor’ components of $\psi^{(1/2)}$ are four-component Dirac
spinors. We assert that we can ensure local SU(2) gauge covariance by replacing
$\partial^\mu$ in this equation by the covariant derivative of (13.10). Indeed, we
have

$$U(\boldsymbol\alpha(x))[i\gamma_\mu D^\mu - m]\,\psi^{(1/2)} = i\gamma_\mu U(\boldsymbol\alpha(x))D^\mu\psi^{(1/2)} - mU(\boldsymbol\alpha(x))\psi^{(1/2)} = i\gamma_\mu D'^\mu\psi^{(1/2)\prime} - m\psi^{(1/2)\prime} \tag{13.36}$$

using equations (13.9) and (13.28). Thus if

$$(i\gamma_\mu D^\mu - m)\,\psi^{(1/2)} = 0 \tag{13.37}$$

then

$$(i\gamma_\mu D'^\mu - m)\,\psi^{(1/2)\prime} = 0, \tag{13.38}$$

proving the asserted covariance. In the same way, any free particle wave
equation satisfied by an ‘isospinor’ $\psi^{(1/2)}$ (the relevant equation is determined
by the Lorentz spin of the particles involved) can be made locally covariant
by the use of the covariant derivative $D^\mu$, just as in the U(1) case.

FIGURE 13.1
Vertex for isospinor-W interaction.

The essential point here, of course, is that the locally covariant form includes
interactions between the $\psi^{(1/2)}$'s and the gauge fields $\boldsymbol W^\mu$, which are
determined by the local phase invariance requirement (the ‘gauge principle’).
Indeed, we can already begin to find some of the Feynman rules appropriate
to tree graphs for SU(2) gauge theories. Consider again the case of an SU(2)
isospinor fermion, $\psi^{(1/2)}$, obeying equation (13.38). This can be written as

$$(i\not\partial - m)\,\psi^{(1/2)} = g(\boldsymbol\tau/2)\cdot\not{\boldsymbol W}\,\psi^{(1/2)}. \tag{13.39}$$

In lowest-order perturbation theory the one-W emission/absorption process
is given by the amplitude (cf (8.39) for the electromagnetic case)

$$-ig\int \bar\psi_{\rm f}^{(1/2)}(\boldsymbol\tau/2)\gamma_\mu\psi_{\rm i}^{(1/2)}\cdot\boldsymbol W^\mu\, d^4x \tag{13.40}$$

exactly as advertized (for the field-theoretic vertex) in (12.129). The matrix
degree of freedom in the $\tau$'s is sandwiched between the two-component
isospinors $\psi^{(1/2)}$; the $\gamma$ matrix acts on the four-component (Dirac) parts of
$\psi^{(1/2)}$. The external $\boldsymbol W^\mu$ field is now specified by a spin-1 polarization vector
$\epsilon^\mu$, like a photon, and by an ‘SU(2) polarization vector’ $a^r$ ($r = 1,2,3$) which
tells us which of the three SU(2) W-states is participating. The Feynman rule
for figure 13.1 is therefore

$$-ig(\tau^r/2)\gamma_\mu \tag{13.41}$$

which is to be sandwiched between spinors/isospinors $u_{\rm i}$, $\bar u_{\rm f}$ and dotted into
$\epsilon^\mu$ and $a^r$. (13.41) is a very economical generalization of rule (ii) in Comment
(3) of section 8.3.1.

The foregoing is easily generalized to SU(2) multiplets other than doublets.
We shall change the notation slightly to use $t$ instead of $T$ for the 'isospin'
quantum number, so as to emphasize that it is not the hadronic isospin, for
which we retain $T$; $t$ will be the symbol used for the weak isospin to be
introduced in chapter 20. The general local SU(2) transformation for a
$t$-multiplet is then
$$\psi^{(t)} \to \psi^{(t)\prime} = \exp[ig\boldsymbol{\alpha}(x)\cdot\boldsymbol{T}^{(t)}]\,\psi^{(t)} \tag{13.42}$$
where the $(2t+1)\times(2t+1)$ matrices $T_{i}^{(t)}$ ($i = 1,2,3$) satisfy (cf (12.47))
$$[T_{i}^{(t)}, T_{j}^{(t)}] = i\epsilon_{ijk}T_{k}^{(t)}. \tag{13.43}$$
The appropriate covariant derivative is
$$D^{\mu} = \partial^{\mu} + ig\boldsymbol{T}^{(t)}\cdot\boldsymbol{W}^{\mu} \tag{13.44}$$
which is a $(2t+1)\times(2t+1)$ matrix acting on the $(2t+1)$ components of
$\psi^{(t)}$. The gauge fields interact with such 'isomultiplets' in a universal way
(only one $g$, the same for all the particles), which is prescribed by the local
covariance requirement to be simply that interaction which is generated by the
covariant derivatives. The fermion vertex corresponding to (13.44) is obtained
by replacing $\boldsymbol{\tau}/2$ in (13.40) by $\boldsymbol{T}^{(t)}$.
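The algebra (13.43) can be checked numerically for the two smallest multiplets. The sketch below is illustrative only: it assumes the standard choices $T_i^{(1/2)} = \tau_i/2$ and $(T_i^{(1)})_{jk} = -i\epsilon_{ijk}$, and the helper names (`eps`, `matmul`, `satisfies_su2_algebra`, `casimir`) are hypothetical, introduced here for the example.

```python
def eps(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[r][m] * b[m][c] for m in range(n)) for c in range(n)]
            for r in range(n)]


def satisfies_su2_algebra(T, tol=1e-12):
    """Check [T_i, T_j] = i eps_ijk T_k for a list T of three square matrices."""
    n = len(T[0])
    for i in range(3):
        for j in range(3):
            ab, ba = matmul(T[i], T[j]), matmul(T[j], T[i])
            for r in range(n):
                for c in range(n):
                    rhs = sum(1j * eps(i, j, k) * T[k][r][c] for k in range(3))
                    if abs(ab[r][c] - ba[r][c] - rhs) > tol:
                        return False
    return True


def casimir(T):
    """Return sum_i T_i T_i, expected to equal t(t+1) times the unit matrix."""
    n = len(T[0])
    squares = [matmul(Ti, Ti) for Ti in T]
    return [[sum(sq[r][c] for sq in squares) for c in range(n)]
            for r in range(n)]


# t = 1/2: T_i = tau_i / 2, with tau_i the Pauli matrices (assumed convention).
T_HALF = [
    [[0, 0.5], [0.5, 0]],
    [[0, -0.5j], [0.5j, 0]],
    [[0.5, 0], [0, -0.5]],
]

# t = 1: (T_i)_{jk} = -i eps_ijk (assumed convention for the triplet).
T_ONE = [[[-1j * eps(i, j, k) for k in range(3)] for j in range(3)]
         for i in range(3)]

# Rescaled doublet generators: used to show that the normalization is fixed.
T_RESCALED = [[[2 * x for x in row] for row in Ti] for Ti in T_HALF]
```

Both `T_HALF` and `T_ONE` pass `satisfies_su2_algebra`, whereas `T_RESCALED` does not: the commutator is quadratic in the generators while the right-hand side is linear, so the overall scale of the $T$'s cannot be changed freely. This is the same non-linearity that forces a single coupling $g$.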

We end this section with some comments: 


(i) It is a remarkable fact that only one constant $g$ is needed. This is not the
same as in electromagnetism. There, each charged field interacts with the
gauge field $A^{\mu}$ via a coupling whose strength is its charge ($e$, $-e$, $2e$, $-5e$, ...).
The crucial point is the appearance of the quadratic $g^{2}$ multiplying the
commutator of the $\tau$'s, $[\boldsymbol{\tau}\cdot\boldsymbol{\epsilon}, \boldsymbol{\tau}\cdot\boldsymbol{W}]$, in the $\boldsymbol{W}^{\mu}$ transformation (equation
(13.20)). In the electromagnetic case, there is no such commutator, since the
associated U(1) phase group is Abelian. As signalled by the presence of
$g^{2}$, a commutator is a non-linear quantity, and the scale of quantities
appearing in such commutation relations is not arbitrary. It is an instructive
exercise to check that, once $\delta\boldsymbol{W}^{\mu}$ is given by equation (13.23) (in the
SU(2) case), then the $g$'s appearing in $\psi^{(1/2)\prime}$ (equation (13.13)) and $\psi^{(t)\prime}$
(via the infinitesimal version of equation (13.42)) must be the same as the
one appearing in $\delta\boldsymbol{W}^{\mu}$.


(ii) According to the foregoing argument, it is actually a mystery why electric
charge should be quantized. Since it is the coupling constant of an Abelian
group, each charged field could have an arbitrary charge from this point
of view: there are no commutators to fix the scale. This is one of the
motivations of attempts to 'embed' the electromagnetic gauge
transformations inside a larger non-Abelian group structure. Such is the case, for
example, in 'grand unified theories' of strong, weak and electromagnetic
interactions.

(iii) Finally we draw attention to the extremely important physical significance
of the second term in $\delta\boldsymbol{W}^{\mu}$ (equation (13.23)). The gauge fields themselves
are not 'inert' as far as the gauge group is concerned: in the SU(2) case
they have 'isospin' 1, while for a general group they belong to the regular
representation of the group. This is profoundly different from the
electromagnetic case, where the gauge field $A^{\mu}$ for the photon is of course
uncharged: quite simply, $e = 0$ for a photon, and the second term in
(13.23) is absent for $A^{\mu}$. The fact that non-Abelian (Yang-Mills) gauge
fields carry non-Abelian 'charge' degrees of freedom means that, since
they are also the quanta of the force field, they will necessarily interact
with themselves. Thus a non-Abelian gauge theory of gauge fields alone,
with no 'matter' fields, has non-trivial interactions and is not a free theory.


We shall examine the form of these 'self-interactions' in section 13.3.2.
First, we need to find the equivalent, for the Yang-Mills field, of the Maxwell
field strength tensor $F^{\mu\nu}$, which gave us the gauge-invariant formulation of
Maxwell's equations, and in terms of which the Maxwell Lagrangian can be
immediately written down.


13.1.2 The non-Abelian field strength tensor 


A simple way of arriving at the desired quantity is to consider the commutator
of two covariant derivatives, as we can see by calculating it for the U(1) case.
We find
$$[D^{\mu}, D^{\nu}] \equiv D^{\mu}D^{\nu} - D^{\nu}D^{\mu} = ieF^{\mu\nu} \tag{13.45}$$
as is verified in problem 13.2. Equation (13.45) suggests that we will find the
SU(2) analogue of $F^{\mu\nu}$ by evaluating
$$[D^{\mu}, D^{\nu}]\,\psi^{(1/2)} \tag{13.46}$$
where as usual
$$D^{\mu}\ (\text{on } \psi^{(1/2)}) = \partial^{\mu} + ig\boldsymbol{\tau}\cdot\boldsymbol{W}^{\mu}/2. \tag{13.47}$$

Problem 13.3 confirms that the result is
$$[D^{\mu}, D^{\nu}]\,\psi^{(1/2)} = ig\,\boldsymbol{\tau}/2\cdot(\partial^{\mu}\boldsymbol{W}^{\nu} - \partial^{\nu}\boldsymbol{W}^{\mu} - g\boldsymbol{W}^{\mu}\times\boldsymbol{W}^{\nu})\,\psi^{(1/2)}; \tag{13.48}$$
the manipulations are very similar to those in (13.20)-(13.23). Noting the
analogy between the right-hand side of (13.48) and (13.45), we accordingly
expect the SU(2) 'curvature', or field strength tensor, to be given by
$$\boldsymbol{F}^{\mu\nu} = \partial^{\mu}\boldsymbol{W}^{\nu} - \partial^{\nu}\boldsymbol{W}^{\mu} - g\boldsymbol{W}^{\mu}\times\boldsymbol{W}^{\nu} \tag{13.49}$$
or, in component notation,
$$F_{i}^{\mu\nu} = \partial^{\mu}W_{i}^{\nu} - \partial^{\nu}W_{i}^{\mu} - g\epsilon_{ijk}W_{j}^{\mu}W_{k}^{\nu}. \tag{13.50}$$

This tensor is of fundamental importance in a (non-Abelian) gauge theory.
Since it arises from the commutator of two gauge-covariant derivatives, we are
guaranteed that it itself is gauge covariant; that is to say, 'it transforms under
local SU(2) transformations in the way its SU(2) structure would indicate'.
Now $\boldsymbol{F}^{\mu\nu}$ clearly has three SU(2) components and must be an SU(2) triplet:
indeed, it is true that under an infinitesimal local SU(2) transformation
$$\boldsymbol{F}'^{\mu\nu} = \boldsymbol{F}^{\mu\nu} - g\boldsymbol{\epsilon}(x)\times\boldsymbol{F}^{\mu\nu} \tag{13.51}$$
which is the expected law (cf (12.64)) for an SU(2) triplet. Problem 13.4
verifies that (13.51) follows from (13.49) and the transformation law (13.23)
for the $\boldsymbol{W}^{\mu}$ fields. Note particularly that $\boldsymbol{F}^{\mu\nu}$ transforms 'properly', as an
SU(2) triplet should, without the $\partial^{\mu}$ part which appears in $\delta\boldsymbol{W}^{\mu}$.
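The commutator result (13.48) can be checked directly. The following is a sketch of the algebra, not the book's solution to problem 13.3; it assumes $D^{\mu} = \partial^{\mu} + ig\boldsymbol{\tau}\cdot\boldsymbol{W}^{\mu}/2$ acting on an isospinor $\psi$ and uses $[\boldsymbol{\tau}\cdot\boldsymbol{a}, \boldsymbol{\tau}\cdot\boldsymbol{b}] = 2i\boldsymbol{\tau}\cdot(\boldsymbol{a}\times\boldsymbol{b})$.

```latex
[D^{\mu}, D^{\nu}]\psi
 = \Big[\partial^{\mu} + \tfrac{1}{2}ig\,\boldsymbol{\tau}\cdot\boldsymbol{W}^{\mu},\;
        \partial^{\nu} + \tfrac{1}{2}ig\,\boldsymbol{\tau}\cdot\boldsymbol{W}^{\nu}\Big]\psi

% [\partial^\mu, \partial^\nu] = 0; the mixed terms leave only derivatives of W:
 = \tfrac{1}{2}ig\,\boldsymbol{\tau}\cdot
     (\partial^{\mu}\boldsymbol{W}^{\nu} - \partial^{\nu}\boldsymbol{W}^{\mu})\,\psi
   + \big(\tfrac{1}{2}ig\big)^{2}
     [\boldsymbol{\tau}\cdot\boldsymbol{W}^{\mu},\,\boldsymbol{\tau}\cdot\boldsymbol{W}^{\nu}]\,\psi

% (ig/2)^2 \cdot 2i = -ig^2/2 = (ig/2)(-g):
 = \tfrac{1}{2}ig\,\boldsymbol{\tau}\cdot
     (\partial^{\mu}\boldsymbol{W}^{\nu} - \partial^{\nu}\boldsymbol{W}^{\mu}
      - g\,\boldsymbol{W}^{\mu}\times\boldsymbol{W}^{\nu})\,\psi .
```

The term quadratic in $g$ exists only because the $\tau$'s do not commute; in the U(1) case the corresponding commutator vanishes and only the 'curl' part survives.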

This non-Abelian $\boldsymbol{F}^{\mu\nu}$ is a much more interesting object than the Abelian
$F^{\mu\nu}$ (which is actually U(1)-gauge invariant, of course: $F'^{\mu\nu} = F^{\mu\nu}$). $\boldsymbol{F}^{\mu\nu}$
contains the gauge coupling constant $g$, confirming (cf comment (iii) in
section 13.1.1) that the gauge fields themselves carry SU(2) 'charge', and act
as sources for the field strength. Appendix N shows how these field strength
tensors may be regarded as analogous to geometrical curvatures.

It is now straightforward to move to the quantum field case and construct
the SU(2) Yang-Mills analogue of the Maxwell Lagrangian $-\tfrac{1}{4}\hat{F}_{\mu\nu}\hat{F}^{\mu\nu}$. It is
simply $-\tfrac{1}{4}\hat{\boldsymbol{F}}_{\mu\nu}\cdot\hat{\boldsymbol{F}}^{\mu\nu}$, the SU(2) 'dot product' ensuring SU(2) invariance (see
problem 13.5), even under local transformation, in view of the transformation
law (13.51). But before proceeding in this way we first need to introduce local
SU(3) symmetry.

13.2 Local SU(3) Symmetry


Using what has been done for global SU(3) symmetry in section 12.2, and
the preceding discussion of how to make a global SU(2) into a local one, it
is straightforward to develop the corresponding theory of local SU(3). This
is the gauge group of QCD, the three degrees of freedom of the fundamental
quark triplet now referring to 'colour', as will be further discussed in chapter
14. We denote the basic triplet by $\psi$, which transforms under a local SU(3)
transformation according to
$$\psi' = \exp[ig_{s}\boldsymbol{\lambda}\cdot\boldsymbol{\alpha}(x)/2]\,\psi, \tag{13.52}$$
which is the same as the global transformation (12.74) but with the 8 constant
parameters $\boldsymbol{\alpha}$ replaced by $x$-dependent ones, and with a coupling strength $g_{s}$
inserted. The SU(3)-covariant derivative, when acting on an SU(3) triplet $\psi$,
is given by the indicated generalization of (13.10), namely
$$D^{\mu}\ (\text{acting on SU(3) triplet}) = \partial^{\mu} + ig_{s}\boldsymbol{\lambda}/2\cdot\boldsymbol{A}^{\mu} \tag{13.53}$$
where $A_{1}^{\mu}, A_{2}^{\mu}, \ldots, A_{8}^{\mu}$ are eight gauge fields which are called gluons. The
coupling is denoted by '$g_{s}$' in anticipation of the application to strong interactions
via QCD.

The infinitesimal version of (13.52) is (cf (13.13))
$$\psi' = (1 + ig_{s}\boldsymbol{\lambda}\cdot\boldsymbol{\eta}(x)/2)\,\psi \tag{13.54}$$
where '1' stands for the unit matrix in the three-dimensional space of
components of the triplet $\psi$. As in (13.14), it is clear that $\partial^{\mu}\psi'$ will involve an
'unwanted' term $\partial^{\mu}\boldsymbol{\eta}(x)$. By contrast, the desired covariant derivative $D^{\mu}\psi$
should transform according to
$$D'^{\mu}\psi' = (1 + ig_{s}\boldsymbol{\lambda}\cdot\boldsymbol{\eta}(x)/2)\,D^{\mu}\psi \tag{13.55}$$
without the $\partial^{\mu}\boldsymbol{\eta}(x)$ term. Problem 13.6 verifies that this is fulfilled by having
the gauge fields transform by
$$A_{a}'^{\mu} = A_{a}^{\mu} - \partial^{\mu}\eta_{a}(x) - g_{s}f_{abc}\,\eta_{b}(x)A_{c}^{\mu}. \tag{13.56}$$

Comparing (13.56) with (12.80) we can identify the term in $f_{abc}$ as telling us
that the 8 fields $A_{a}^{\mu}$ transform as an SU(3) octet, the $\eta$'s now depending on
$x$, of course. This is the adjoint, or regular, representation of SU(3), as we
have now come to expect for gauge fields. However, the $\partial^{\mu}\eta_{a}(x)$ piece spoils
this simple transformation property under local transformations. But it is
just what is needed to cancel the corresponding $\partial^{\mu}\boldsymbol{\eta}(x)$ term in $\partial^{\mu}\psi'$, leaving
$D^{\mu}\psi$ transforming as a proper triplet via (13.55). The finite version of (13.56)
can be derived as in section 13.1 for SU(2), but we shall not need the result
here.
As in the SU(2) case, the free Dirac equation for an SU(3)-triplet $\psi$,
$$(i\gamma_{\mu}\partial^{\mu} - m)\psi = 0, \tag{13.57}$$
can be 'promoted' into one which is covariant under local SU(3)
transformations by replacing $\partial^{\mu}$ by $D^{\mu}$ of (13.53), leading to
$$(i\slashed{\partial} - m)\psi = g_{s}\boldsymbol{\lambda}/2\cdot\slashed{\boldsymbol{A}}\,\psi \tag{13.58}$$
(compare (13.39)). This leads immediately to the one gluon emission
amplitude (see figure 13.2)
$$-ig_{s}\int \bar{\psi}_{f}\,(\boldsymbol{\lambda}/2)\,\gamma^{\mu}\,\psi_{i}\cdot\boldsymbol{A}_{\mu}\,d^{4}x \tag{13.59}$$
as already suggested in section 12.3.1: the SU(3) current of (12.133), but
this time in colour space, is 'dotted' with the gauge field. The Feynman rule
for figure 13.2 is therefore
$$-ig_{s}\lambda_{a}/2\;\gamma^{\mu}. \tag{13.60}$$

FIGURE 13.2
Quark-gluon vertex.

The SU(3) field strength tensor can be calculated by evaluating the
commutator of two $D$'s of the form (13.53); the result (problem 13.7) is
$$F_{a}^{\mu\nu} = \partial^{\mu}A_{a}^{\nu} - \partial^{\nu}A_{a}^{\mu} - g_{s}f_{abc}A_{b}^{\mu}A_{c}^{\nu} \tag{13.61}$$
which is closely analogous to the SU(2) case (13.50) (the structure constants
of SU(2) are given by $i\epsilon_{ijk}$, and of SU(3) by $if_{abc}$). Once again, the crucial
property of $F_{a}^{\mu\nu}$ is that, under local SU(3) transformations, it develops no
'$\partial^{\mu}\eta_{a}$' part, but transforms as a 'proper' octet:
$$F_{a}'^{\mu\nu} = F_{a}^{\mu\nu} - g_{s}f_{abc}\,\eta_{b}(x)F_{c}^{\mu\nu}. \tag{13.62}$$
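The octet law (13.62) can be explored numerically. The sketch below is an illustration under stated assumptions, not part of the original text: it uses the standard Gell-Mann matrices, extracts $f_{abc}$ from $\mathrm{Tr}([\lambda_a,\lambda_b]\lambda_c) = 4if_{abc}$, and then evaluates the first-order change of $\sum_a F_a F_a$ under $\delta F_a = -g_s f_{abc}\eta_b F_c$ with $g_s$ set to 1. The names `GELL_MANN`, `structure_constant`, `F_ABC` and `first_order_change` are hypothetical.

```python
import math

S = 1 / math.sqrt(3)

# Standard Gell-Mann matrices lambda_1 ... lambda_8 (stored with indices 0..7).
GELL_MANN = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
    [[S, 0, 0], [0, S, 0], [0, 0, -2 * S]],
]


def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[r][m] * b[m][c] for m in range(3)) for c in range(3)]
            for r in range(3)]


def trace(a):
    return a[0][0] + a[1][1] + a[2][2]


def structure_constant(a, b, c):
    """f_abc = Tr([lambda_a, lambda_b] lambda_c) / (4i), with indices 0..7."""
    la, lb, lc = GELL_MANN[a], GELL_MANN[b], GELL_MANN[c]
    ab_c = trace(matmul(matmul(la, lb), lc))
    ba_c = trace(matmul(matmul(lb, la), lc))
    return ((ab_c - ba_c) / 4j).real


# Full table of structure constants.
F_ABC = [[[structure_constant(a, b, c) for c in range(8)] for b in range(8)]
         for a in range(8)]


def first_order_change(eta, field):
    """First-order change of sum_a field_a**2 when
    delta field_a = -f_abc eta_b field_c (coupling g_s set to 1)."""
    return sum(-2 * F_ABC[a][b][c] * eta[b] * field[c] * field[a]
               for a in range(8) for b in range(8) for c in range(8))
```

For any choice of the eight parameters `eta` and eight components `field`, `first_order_change` vanishes up to rounding, because $f_{abc}$ is antisymmetric under $a \leftrightarrow c$ while $F_cF_a$ is symmetric. This is the sense in which 'dotting the two octets together' gives an invariant.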


This allows us to write down a locally SU(3)-invariant analogue of the Maxwell
Lagrangian
$$-\tfrac{1}{4}F_{a\mu\nu}F_{a}^{\mu\nu} \tag{13.63}$$
by dotting the two octets together.

It is now time to consider locally SU(2)- and SU(3)-invariant quantum
field Lagrangians and, in particular, the resulting self-interactions among the
gauge quanta.



13.3 Local non-Abelian symmetries in Lagrangian 
quantum field theory 


13.3.1 Local SU(2) and SU(3) Lagrangians 


We consider here only the particular examples relevant to the strong and
electroweak interactions of quarks: namely, a (weak) SU(2) doublet of fermions
interacting with SU(2) gauge fields $W_{i}^{\mu}$, and a (strong) SU(3) triplet of fermions
interacting with the gauge fields $A_{a}^{\mu}$. We follow the same steps as in the U(1)
case of chapter 7, noting again that for quantum fields the sign of the
exponents in (13.2) and (13.52) is reversed, by convention; thus (12.89) is replaced
by its local version
$$\hat{q}' = \exp(-ig\hat{\boldsymbol{\alpha}}(x)\cdot\boldsymbol{\tau}/2)\,\hat{q} \tag{13.64}$$
and (12.132) by
$$\hat{q}' = \exp(-ig_{s}\hat{\boldsymbol{\alpha}}(x)\cdot\boldsymbol{\lambda}/2)\,\hat{q}. \tag{13.65}$$
Correspondingly, the $\boldsymbol{\epsilon}$ in (13.23) and the $\eta$'s in (13.56) become field operators,
with a reversal of sign.

The globally SU(2)-invariant Lagrangian (12.87) becomes locally
SU(2)-invariant if we replace $\partial^{\mu}$ by $D^{\mu}$ of (13.10), with $\hat{\boldsymbol{W}}^{\mu}$ now a quantum field:
$$\hat{\mathcal{L}}_{\mathrm{D,\,local\ SU(2)}} = \bar{\hat{q}}\,(i\hat{\slashed{D}} - m)\,\hat{q} = \bar{\hat{q}}\,(i\slashed{\partial} - m)\,\hat{q} - g\,\bar{\hat{q}}\,\gamma^{\mu}\,\boldsymbol{\tau}/2\;\hat{q}\cdot\hat{\boldsymbol{W}}_{\mu}, \tag{13.66}$$
with an interaction of the form 'symmetry current (12.109) dotted into the
gauge field'. To this we must add the SU(2) Yang-Mills term
$$\hat{\mathcal{L}}_{\mathrm{Y\text{-}M,\,SU(2)}} = -\tfrac{1}{4}\hat{\boldsymbol{F}}_{\mu\nu}\cdot\hat{\boldsymbol{F}}^{\mu\nu} \tag{13.67}$$
to get the local SU(2) analogue of $\hat{\mathcal{L}}_{\mathrm{QED}}$. It is not possible to add a mass term
for the gauge fields of the form $\tfrac{1}{2}M^{2}\hat{\boldsymbol{W}}^{\mu}\cdot\hat{\boldsymbol{W}}_{\mu}$, since such a term would not be
invariant under the gauge transformations (13.26) or (13.34) of the W-fields.
Thus, just as in the U(1) (electromagnetic) case, the W-quanta of this theory
are massless. We presumably also need a gauge-fixing term for the gauge
fields, as in section 7.3.2, which we can take to be¹
$$\hat{\mathcal{L}}_{\mathrm{gf}} = -\frac{1}{2\xi}(\partial_{\mu}\hat{\boldsymbol{W}}^{\mu}\cdot\partial_{\nu}\hat{\boldsymbol{W}}^{\nu}). \tag{13.68}$$
The Feynman rule for the fermion-W vertex is then the same as already given
in (13.41), while the W-propagator is (figure 13.3)
$$\frac{i\,[-g^{\mu\nu} + (1-\xi)k^{\mu}k^{\nu}/k^{2}]}{k^{2} + i\epsilon}\,\delta_{ij}. \tag{13.69}$$

FIGURE 13.3
SU(2) gauge-boson propagator.

Before proceeding to the SU(3) case, we must now emphasize three respects
in which our local SU(2) Lagrangian is not suitable (yet) for describing weak
interactions. First, weak interactions violate parity, in fact 'maximally', by
which is meant that only the 'left-handed' part $\hat{\psi}_{L}$ of the fermion field enters
the interactions with the $\hat{\boldsymbol{W}}^{\mu}$ fields, where $\hat{\psi}_{L} \equiv \left(\frac{1-\gamma_{5}}{2}\right)\hat{\psi}$; for this reason
the weak isospin group is called SU(2)$_L$. Secondly, the physical $W^{\pm}$ are of
course not massless, and therefore cannot be described by propagators of the
form (13.69). And thirdly, the fermion mass term violates the 'left-handed'
SU(2) gauge symmetry, as the discussion in section 12.3.2 shows. In this
case, however, the chiral symmetry which is broken by fermion masses in the
Lagrangian is a local, or gauge, symmetry (in section 12.3.2 the chiral flavour
symmetry was a global symmetry). If we want to preserve the chiral gauge
symmetry SU(2)$_L$ (and it is necessary for renormalizability) then we shall
have to replace the simple fermion mass term in (13.66) by something else, as
will be explained in chapter 22.

¹ We shall see in section 13.5.3 that in the non-Abelian case this gauge-fixing term does
not completely solve the problem of quantizing such gauge fields; however, it is adequate
for tree graphs.

The locally SU(3)$_c$-invariant Lagrangian for one quark triplet (cf (12.137))
$$\hat{q}_{f} = \begin{pmatrix} \hat{f}_{r} \\ \hat{f}_{b} \\ \hat{f}_{g} \end{pmatrix} \tag{13.70}$$
where 'f' stands for 'flavour', and 'r, b, and g' for 'red, blue, and green', is
$$\bar{\hat{q}}_{f}\,(i\hat{\slashed{D}} - m_{f})\,\hat{q}_{f} - \tfrac{1}{4}\hat{F}_{a\mu\nu}\hat{F}_{a}^{\mu\nu} - \frac{1}{2\xi}(\partial_{\mu}\hat{A}_{a}^{\mu})(\partial_{\nu}\hat{A}_{a}^{\nu}) \tag{13.71}$$
where $\hat{D}^{\mu}$ is given by (13.53) with $\boldsymbol{A}^{\mu}$ replaced by $\hat{\boldsymbol{A}}^{\mu}$, and the footnote
before equation (13.68) also applies here. This leads to the interaction term
(cf (13.59))
$$-g_{s}\,\bar{\hat{q}}_{f}\,\gamma^{\mu}\,\boldsymbol{\lambda}/2\;\hat{q}_{f}\cdot\hat{\boldsymbol{A}}_{\mu} \tag{13.72}$$
and the Feynman rule (13.60) for figure 13.2. Once again, the gluon quanta
must be massless, and their propagator is the same as (13.69), with $\delta_{ij} \to \delta_{ab}$
($a,b = 1,2,\ldots,8$). The different quark flavours are included by simply
repeating the first term of (13.71) for all flavours:
$$\sum_{f}\bar{\hat{q}}_{f}\,(i\hat{\slashed{D}} - m_{f})\,\hat{q}_{f}, \tag{13.73}$$
which incorporates the hypothesis that the SU(3)$_c$-gauge interaction is
'flavour-blind', i.e. exactly the same for each flavour. Note that although the flavour
masses are different, the masses of different 'coloured' quarks of the same
flavour are the same ($m_{u} \neq m_{d}$, $m_{u,r} = m_{u,b} = m_{u,g}$).

The Lagrangians (13.66)-(13.68), and (13.71), though easily written down
after all this preparation, are unfortunately not adequate for anything but
tree graphs. We shall indicate why this is so in section 13.3.3. Before that, we
want to discuss in more detail the nature of the gauge-field self-interactions
contained in the Yang-Mills pieces.

13.3.2 Gauge field self-interactions


We start by pointing out an interesting ambiguity in the prescription for
'covariantizing' wave equations which we have followed, namely 'replace $\partial^{\mu}$
by $D^{\mu}$'. Suppose we wished to consider the electromagnetic interactions of
charged massless spin-1 particles, call them X's, carrying charge $e$. The
standard wave equation for such free massless vector particles would be the same
as for $A^{\mu}$, namely
$$\Box X^{\mu} - \partial^{\mu}\partial^{\nu}X_{\nu} = 0. \tag{13.74}$$

To 'covariantize' this (i.e. introduce the electromagnetic coupling) we would
replace $\partial^{\mu}$ by $D^{\mu} = \partial^{\mu} + ieA^{\mu}$ so as to obtain
$$D^{2}X^{\mu} - D^{\mu}D^{\nu}X_{\nu} = 0. \tag{13.75}$$

But this procedure is not unique: if we had started from the perfectly
equivalent wave equation
$$\Box X^{\mu} - \partial^{\nu}\partial^{\mu}X_{\nu} = 0 \tag{13.76}$$
we would have arrived at
$$D^{2}X^{\mu} - D^{\nu}D^{\mu}X_{\nu} = 0 \tag{13.77}$$
which is not the same as (13.75), since (cf (13.45))
$$[D^{\mu}, D^{\nu}] = ieF^{\mu\nu}. \tag{13.78}$$

The simple prescription $\partial^{\mu} \to D^{\mu}$ has, in this case, failed to produce a
unique wave equation. We can allow for this ambiguity by introducing an
arbitrary parameter $\delta$ in the wave equation, which we write as
$$D^{2}X^{\mu} - D^{\nu}D^{\mu}X_{\nu} + ie\delta F^{\mu\nu}X_{\nu} = 0. \tag{13.79}$$

The $\delta$ term in (13.79) contributes to the magnetic moment coupling of the
X-particle to the electromagnetic field, and is called the 'ambiguous magnetic
moment'. Just such an ambiguity would seem to arise in the case of the
charged weak interaction quanta $W^{\pm}$ (their masses do not affect this
argument). For the photon itself, of course, $e = 0$ and there is no such ambiguity.
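It may help to see how the two 'covariantized' equations sit inside the one-parameter family. The short check below is a sketch that reads (13.79) as $D^{2}X^{\mu} - D^{\nu}D^{\mu}X_{\nu} + ie\delta F^{\mu\nu}X_{\nu} = 0$ and uses only the commutator (13.78).

```latex
% From (13.78):
D^{\mu}D^{\nu}X_{\nu} = D^{\nu}D^{\mu}X_{\nu} + ieF^{\mu\nu}X_{\nu}.

% Hence (13.75) can be rewritten as
D^{2}X^{\mu} - D^{\nu}D^{\mu}X_{\nu} - ieF^{\mu\nu}X_{\nu} = 0
\quad\Longleftrightarrow\quad \delta = -1 \ \text{in (13.79)},

% while (13.77) is simply
D^{2}X^{\mu} - D^{\nu}D^{\mu}X_{\nu} = 0
\quad\Longleftrightarrow\quad \delta = 0 \ \text{in (13.79)}.
```

Neither of the two 'obvious' orderings therefore corresponds to the value $\delta = 1$ that the SU(2) gauge symmetry will turn out to select.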

It is important to be clear that (13.79) is fully U(1) gauge-covariant, so that
$\delta$ cannot be fixed by further appeal to the local U(1) symmetry. Moreover, it
turns out that the theory for arbitrary $\delta$ is not renormalizable (though we shall
not show this here): thus the quantum electrodynamics of charged massless
vector bosons is in general non-renormalizable.

However, the theory is renormalizable if (to continue with the present
terminology) the photon, the X-particle, and its antiparticle the $\bar{X}$ are the
members of an SU(2) gauge triplet (like the W's), with gauge coupling
constant $e$. This is, indeed, very much how the photon and the $W^{\pm}$ are 'unified',
but there is a complication (as always!) in that case, having to do with the
necessity for finding room in the scheme for the neutral weak boson $Z^{0}$ as
well. We shall see how this works in chapter 19; meanwhile we continue with
this $X - \gamma$ model. We shall show that when the $X - \gamma$ interaction contained
in (13.79) is regarded as a $3 - X$ vertex in a local SU(2) gauge theory, the
value of $\delta$ has to equal 1; for this value the theory is renormalizable. In this
interpretation, the $X^{\mu}$ wave function is identified with $\frac{1}{\sqrt{2}}(X_{1}^{\mu} + iX_{2}^{\mu})$ and
$\bar{X}^{\mu}$ with $\frac{1}{\sqrt{2}}(X_{1}^{\mu} - iX_{2}^{\mu})$ in terms of components of the SU(2) triplet $X_{i}^{\mu}$,
while $A^{\mu}$ is identified with $X_{3}^{\mu}$.
Consider then equation (13.79) written in the form²
$$\Box X^{\mu} - \partial^{\nu}\partial^{\mu}X_{\nu} = \hat{V}X^{\mu} \tag{13.80}$$
where
$$\begin{aligned}
\hat{V}X^{\mu} = -ie\{\,&[\partial^{\nu}(A_{\nu}X^{\mu}) + A^{\nu}\partial_{\nu}X^{\mu}] \\
&- (1+\delta)\,[\partial^{\nu}(A^{\mu}X_{\nu}) + A^{\nu}\partial^{\mu}X_{\nu}] \\
&+ \delta\,[\partial^{\mu}(A^{\nu}X_{\nu}) + A^{\mu}\partial^{\nu}X_{\nu}]\,\},
\end{aligned} \tag{13.81}$$
and we have dropped terms of $O(e^{2})$ which appear in the '$D^{2}$' term; we shall
come back to them later. The terms inside the { } brackets have been written
in such a way that each [ ] bracket has the structure
$$\partial(AX) + A(\partial X) \tag{13.82}$$
which will be convenient for the following evaluation.

² The sign chosen for $\hat{V}$ here apparently differs from that in the KG case (3.101), but
it does agree when allowance is made, in the amplitude (13.83), for the fact that the dot
product of the polarization vectors is negative (cf (7.87)).

The lowest-order ($O(e)$) perturbation theory amplitude for '$X \to X$' under
the potential $\hat{V}$ is then
$$-i\int X_{\mu}^{*}(f)\,\hat{V}X^{\mu}(i)\,d^{4}x. \tag{13.83}$$

Inserting (13.81) into (13.83) clearly gives something involving two
'X'-wavefunctions and one 'A' one, i.e. a triple-X vertex (with $A^{\mu} \equiv X_{3}^{\mu}$), shown in
figure 13.4. To obtain the rule for this vertex from (13.83), consider the first
[ ] bracket in (13.81). It contributes
$$-i(-ie)\int X_{\mu}^{*}(2)\,\{\partial^{\nu}(X_{3\nu}(3)X^{\mu}(1)) + X_{3}^{\nu}(3)\partial_{\nu}X^{\mu}(1)\}\,d^{4}x \tag{13.84}$$
where the (1), (2), (3) refer to the momenta as shown in figure 13.4, and for
reasons of symmetry are all taken to be ingoing; thus
$$X_{3}^{\mu}(3) = \epsilon_{3}^{\mu}\exp(-ik_{3}\cdot x) \tag{13.85}$$
for example.

FIGURE 13.4
Triple-X vertex.

The first term in (13.84) can be easily evaluated by a partial
integration to turn the $\partial^{\nu}$ onto the $X_{\mu}^{*}(2)$, while in the second term $\partial_{\nu}$ acts
straightforwardly on $X^{\mu}(1)$. Omitting the usual $(2\pi)^{4}\delta^{4}$ energy-momentum
conserving factor, we find (problem 13.8) that (13.84) leads to the amplitude
$$ie\,\epsilon_{1}\cdot\epsilon_{2}\,(k_{1} - k_{2})\cdot\epsilon_{3}. \tag{13.86}$$
In a similar way, the other terms in (13.83) give
$$-ie\delta\,(\epsilon_{1}\cdot\epsilon_{3}\;\epsilon_{2}\cdot k_{2} - \epsilon_{2}\cdot\epsilon_{3}\;\epsilon_{1}\cdot k_{1}) \tag{13.87}$$
and
$$+ie(1+\delta)\,(\epsilon_{2}\cdot\epsilon_{3}\;\epsilon_{1}\cdot k_{2} - \epsilon_{1}\cdot\epsilon_{3}\;\epsilon_{2}\cdot k_{1}). \tag{13.88}$$

Adding all the terms up and using the 4-momentum conservation condition
$$k_{1} + k_{2} + k_{3} = 0 \tag{13.89}$$
we obtain the vertex
$$+ie\,(\epsilon_{1}\cdot\epsilon_{2}\,(k_{1} - k_{2})\cdot\epsilon_{3} + \epsilon_{2}\cdot\epsilon_{3}\,(\delta k_{2} - k_{3})\cdot\epsilon_{1} + \epsilon_{3}\cdot\epsilon_{1}\,(k_{3} - \delta k_{1})\cdot\epsilon_{2}). \tag{13.90}$$
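The step from the three pieces (13.86)-(13.88) to the combined vertex (13.90) can be checked with exact rational arithmetic. The sketch below is illustrative: it drops the common factor $ie$, uses the metric $(+,-,-,-)$, and assumes, besides $k_1+k_2+k_3=0$, the transversality conditions $\epsilon_1\cdot k_1 = \epsilon_2\cdot k_2 = 0$. The test momenta are arbitrary off-shell values chosen only to exercise the algebra, and all function names are hypothetical.

```python
from fractions import Fraction as Fr


def mdot(a, b):
    """Minkowski product with metric (+, -, -, -)."""
    return a[0] * b[0] - a[1] * b[1] - a[2] * b[2] - a[3] * b[3]


def make_transverse(eps_vec, k):
    """Remove the component of eps_vec along k so that eps.k = 0 (needs k.k != 0)."""
    coeff = mdot(eps_vec, k) / mdot(k, k)
    return [e - coeff * ki for e, ki in zip(eps_vec, k)]


def summed_pieces(e1, e2, e3, k1, k2, delta):
    """(13.86) + (13.87) + (13.88) with the overall factor i*e removed."""
    t86 = mdot(e1, e2) * (mdot(k1, e3) - mdot(k2, e3))
    t87 = -delta * (mdot(e1, e3) * mdot(e2, k2) - mdot(e2, e3) * mdot(e1, k1))
    t88 = (1 + delta) * (mdot(e2, e3) * mdot(e1, k2) - mdot(e1, e3) * mdot(e2, k1))
    return t86 + t87 + t88


def combined_vertex(e1, e2, e3, k1, k2, k3, delta):
    """Bracket of (13.90) with the overall factor i*e removed."""
    a = mdot(e1, e2) * (mdot(k1, e3) - mdot(k2, e3))
    b = mdot(e2, e3) * (delta * mdot(k2, e1) - mdot(k3, e1))
    c = mdot(e3, e1) * (mdot(k3, e2) - delta * mdot(k1, e2))
    return a + b + c


def forms_agree(delta):
    """Compare the two expressions for one set of illustrative test vectors."""
    k1 = [Fr(3), Fr(1), Fr(0), Fr(2)]
    k2 = [Fr(2), Fr(0), Fr(1), Fr(-1)]
    k3 = [-(x + y) for x, y in zip(k1, k2)]  # momentum conservation (13.89)
    e1 = make_transverse([Fr(1), Fr(2), Fr(3), Fr(4)], k1)
    e2 = make_transverse([Fr(0), Fr(1), Fr(-1), Fr(2)], k2)
    e3 = [Fr(1), Fr(0), Fr(2), Fr(-3)]
    lhs = summed_pieces(e1, e2, e3, k1, k2, delta)
    rhs = combined_vertex(e1, e2, e3, k1, k2, k3, delta)
    return lhs == rhs
```

Calling `forms_agree(Fr(1))`, `forms_agree(Fr(0))` or any other rational `delta` compares the two expressions exactly. The agreement relies on transversality as well as on momentum conservation: the difference of the two forms is proportional to $(\delta-1)(\epsilon_2\cdot\epsilon_3\,\epsilon_1\cdot k_1 - \epsilon_1\cdot\epsilon_3\,\epsilon_2\cdot k_2)$.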


It is quite evident from (13.90) that the value $\delta = 1$ has a privileged role,
and we strongly suspect that this will be the value selected by the proposed
SU(2) gauge symmetry of this model. We shall check this in two ways: in the
first, we consider a 'physical' process involving the vertex (13.90), and show
how requiring it to be SU(2)-gauge invariant fixes $\delta$ to be 1; in the second, we
'unpack' the relevant vertex from the compact Yang-Mills Lagrangian
$-\tfrac{1}{4}\hat{\boldsymbol{X}}_{\mu\nu}\cdot\hat{\boldsymbol{X}}^{\mu\nu}$.

The process we shall choose is $X + d \to X + d$ where d is a fermion (which
we call a quark) transforming as the $T_{3} = -\tfrac{1}{2}$ component of a doublet under
the SU(2) gauge group, its $T_{3} = +\tfrac{1}{2}$ partner being the u. There are two
contributing Feynman graphs, shown in figure 13.5(a) and (b). Consider first
the amplitude for figure 13.5(a). We use the rule of figure 13.1, with the
$\tau$-matrix combination $\tau_{+} = (\tau_{1} + i\tau_{2})/\sqrt{2}$ corresponding to the absorption of
the positively charged X, and $\tau_{-} = (\tau_{1} - i\tau_{2})/\sqrt{2}$ for the emission of the X.
Then figure 13.5(a) is
$$(-ie)^{2}\,\bar{\psi}^{(1/2)}(p_{2})\,\frac{\tau_{-}}{2}\,\slashed{\epsilon}_{2}\,\frac{i}{\slashed{p}_{1} + \slashed{k}_{1} - m}\,\frac{\tau_{+}}{2}\,\slashed{\epsilon}_{1}\,\psi^{(1/2)}(p_{1}) \tag{13.91}$$
where
$$\psi^{(1/2)} = \begin{pmatrix} u \\ d \end{pmatrix} \tag{13.92}$$
and we have chosen real polarization vectors. Using the explicit forms (12.25)
for the $\tau$-matrices, (13.91) becomes
$$(-ie)^{2}\,\bar{d}(p_{2})\,\frac{1}{\sqrt{2}}\,\slashed{\epsilon}_{2}\,\frac{i}{\slashed{p}_{1} + \slashed{k}_{1} - m}\,\frac{1}{\sqrt{2}}\,\slashed{\epsilon}_{1}\,d(p_{1}). \tag{13.93}$$

FIGURE 13.5
Tree graphs contributing to $X + d \to X + d$.

We must now discuss how to implement gauge invariance. In the QED case
of electron Compton scattering (section 8.6.2), the test of gauge invariance was
that the amplitude should vanish if any photon polarization vector $\epsilon^{\mu}(k)$ was
replaced by $k^{\mu}$; see (8.165). This requirement was derived from the fact that a
gauge transformation on the photon $A^{\mu}$ took the form $A^{\mu} \to A'^{\mu} = A^{\mu} - \partial^{\mu}\chi$,
so that, consistently with the Lorentz condition, $\epsilon^{\mu}$ could be replaced by
$\epsilon'^{\mu} = \epsilon^{\mu} + \beta k^{\mu}$ (cf (8.163)) without changing the physics. But the SU(2) analogue
of the U(1) gauge transformation is given by (13.26), for infinitesimal $\epsilon$'s, and
although there is indeed an analogous '$-\partial^{\mu}\boldsymbol{\epsilon}$' part, there is also an additional
part (with $g \to e$ in our case) expressing the fact that the X's carry SU(2)
charge. However this extra part does involve the coupling $e$. Hence, if we were
to make the full change corresponding to (13.26) in a tree graph of order $e^{2}$,
the extra part would produce a term of order $e^{3}$. We shall take the view that
gauge invariance should hold at each order of perturbation theory separately;
thus we shall demand that the tree graphs for X-d scattering, for example,
should be invariant under $\epsilon^{\mu} \to k^{\mu}$ for any $\epsilon$.

FIGURE 13.6
Tree graphs contributing to $\gamma + X \to \gamma + X$.

The replacement $\epsilon_{1} \to k_{1}$ in (13.93) produces the result (problem 13.9)
$$(-ie)^{2}\,\frac{i}{2}\,\bar{d}(p_{2})\,\slashed{\epsilon}_{2}\,d(p_{1}) \tag{13.94}$$
where we have used the Dirac equation for the quark spinors of mass $m$. The
term (13.94) is certainly not zero, but we must of course also include the
amplitude for figure 13.5(b). Using the vertex of (13.90) with suitable sign
changes of momenta, and the photon propagator of (7.119), and remembering
that d has $\tau_{3} = -1$, the amplitude for figure 13.5(b) is
$$ie\,[\epsilon_{1}\cdot\epsilon_{2}\,(k_{1} + k_{2})_{\mu} + \epsilon_{2\mu}\,\epsilon_{1}\cdot(-\delta k_{2} - k_{2} + k_{1}) + \epsilon_{1\mu}\,\epsilon_{2}\cdot(k_{2} - k_{1} - \delta k_{1})] \times \frac{-ig^{\mu\nu}}{q^{2}} \times [-ie\,\bar{d}(p_{2})\,(-\tfrac{1}{2})\,\gamma_{\nu}\,d(p_{1})], \tag{13.95}$$
where $q^{2} = (k_{1} - k_{2})^{2} = -2k_{1}\cdot k_{2}$ using $k_{1}^{2} = k_{2}^{2} = 0$, and where the
$\xi$-dependent part of the $\gamma$-propagator vanishes since $\bar{d}(p_{2})\,\slashed{q}\,d(p_{1}) = 0$. We now
leave it as an exercise (problem 13.10) to verify that, when $\epsilon_{1} \to k_{1}$ in (13.95),
the resulting amplitude does exactly cancel the contribution (13.94), provided
that $\delta = 1$. Thus the $X - X - \gamma$ vertex is, assuming the SU(2) gauge symmetry,
$$ie\,[\epsilon_{1}\cdot\epsilon_{2}\,(k_{1} - k_{2})\cdot\epsilon_{3} + \epsilon_{2}\cdot\epsilon_{3}\,(k_{2} - k_{3})\cdot\epsilon_{1} + \epsilon_{3}\cdot\epsilon_{1}\,(k_{3} - k_{1})\cdot\epsilon_{2}]. \tag{13.96}$$
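The cancellation left to problem 13.10 can be sketched as follows. This is an outline under stated assumptions, not the book's solution: it reads (13.94) as $(-ie)^{2}\tfrac{i}{2}\bar{d}(p_{2})\slashed{\epsilon}_{2}d(p_{1})$, takes the vertex bracket of figure 13.5(b) to be $\epsilon_{1}\cdot\epsilon_{2}(k_{1}+k_{2})_{\mu} + \epsilon_{2\mu}\,\epsilon_{1}\cdot(k_{1} - (1+\delta)k_{2}) + \epsilon_{1\mu}\,\epsilon_{2}\cdot(k_{2} - (1+\delta)k_{1})$, and uses $k_{1}^{2} = k_{2}^{2} = 0$, $\epsilon_{2}\cdot k_{2} = 0$ and $q = k_{1} - k_{2} = p_{2} - p_{1}$.

```latex
% Replace \epsilon_1 \to k_1 in the vertex bracket:
k_{1}\cdot\epsilon_{2}\,(k_{1}+k_{2})_{\mu}
 - (1+\delta)\,k_{1}\cdot k_{2}\;\epsilon_{2\mu}
 - (1+\delta)\,k_{1\mu}\;\epsilon_{2}\cdot k_{1}
 = k_{1}\cdot\epsilon_{2}\,(k_{2} - \delta k_{1})_{\mu}
   + \tfrac{1}{2}(1+\delta)\,q^{2}\,\epsilon_{2\mu},
 \qquad q^{2} = -2k_{1}\cdot k_{2}.

% For \delta = 1 the first term is proportional to q_\mu and drops out, since
\bar{d}(p_{2})\,\slashed{q}\,d(p_{1})
 = \bar{d}(p_{2})\,(\slashed{p}_{2} - \slashed{p}_{1})\,d(p_{1}) = (m - m)\,\bar{d}d = 0 .

% The remaining term gives, for figure 13.5(b) with \delta = 1,
ie \cdot q^{2}\,\epsilon_{2\mu}\cdot\frac{-ig^{\mu\nu}}{q^{2}}\cdot
 (-ie)\,(-\tfrac{1}{2})\,\bar{d}(p_{2})\gamma_{\nu}d(p_{1})
 = +\tfrac{1}{2}\,ie^{2}\,\bar{d}(p_{2})\,\slashed{\epsilon}_{2}\,d(p_{1}),

% which cancels the contribution of figure 13.5(a):
(-ie)^{2}\,\tfrac{i}{2}\,\bar{d}(p_{2})\,\slashed{\epsilon}_{2}\,d(p_{1})
 = -\tfrac{1}{2}\,ie^{2}\,\bar{d}(p_{2})\,\slashed{\epsilon}_{2}\,d(p_{1}) .
```

For $\delta \neq 1$ the cancellation fails on two counts: the coefficient of $q^{2}\epsilon_{2\mu}$ is $\tfrac{1}{2}(1+\delta)$ rather than 1, and the vector $k_{2} - \delta k_{1}$ is no longer proportional to $q$.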

The verification of this non-Abelian gauge invariance to order $e^{2}$ is, of
course, not a proof that the entire theory of massless X quanta, $\gamma$'s and quark
isospinors will be gauge invariant if $\delta = 1$. Indeed, having obtained the
$X - X - \gamma$ vertex, we immediately have something new to check: we can see if
the lowest-order $\gamma - X$ scattering amplitude is gauge invariant. The $X - X - \gamma$
vertex will generate the $O(e^{2})$ graphs shown in figure 13.6, and the dedicated
reader may check that the sum of these amplitudes is not gauge invariant,
again in the (tree-graph) sense of not vanishing when any $\epsilon$ is replaced by the
corresponding $k$. But this is actually correct. In obtaining the $X - X - \gamma$
vertex we dropped an $O(e^{2})$ term involving the three fields A, A and X, in
going from (13.81) to (13.90): this will generate an $O(e^{2})$ $\gamma - \gamma - X - X$
interaction, figure 13.7, when used in lowest-order perturbation theory. One
can find the amplitude for figure 13.7 by the gauge invariance requirement
applied to figures 13.6 and 13.7, but it has to be admitted that this approach
is becoming laborious. It is, of course, far more efficient to deduce the vertices
from the compact Yang-Mills Lagrangian $-\tfrac{1}{4}\hat{\boldsymbol{X}}_{\mu\nu}\cdot\hat{\boldsymbol{X}}^{\mu\nu}$, which we shall now
do; nevertheless, some of the physical implications of those couplings, such as
we have discussed above, are worth exposing.

FIGURE 13.7
$\gamma - \gamma - X - X$ vertex.

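The SU(2) Yang-Mills Lagrangian for the SU(2) triplet of gauge fields $\hat{\boldsymbol{X}}^{\mu}$
is
$$\hat{\mathcal{L}}_{2,\mathrm{YM}} = -\tfrac{1}{4}\hat{\boldsymbol{X}}_{\mu\nu}\cdot\hat{\boldsymbol{X}}^{\mu\nu} \tag{13.97}$$
where
$$\hat{\boldsymbol{X}}^{\mu\nu} = \partial^{\mu}\hat{\boldsymbol{X}}^{\nu} - \partial^{\nu}\hat{\boldsymbol{X}}^{\mu} - e\hat{\boldsymbol{X}}^{\mu}\times\hat{\boldsymbol{X}}^{\nu}. \tag{13.98}$$
$\hat{\mathcal{L}}_{2,\mathrm{YM}}$ can be unpacked a bit into
$$\begin{aligned}
&-\tfrac{1}{2}(\partial_{\mu}\hat{\boldsymbol{X}}_{\nu} - \partial_{\nu}\hat{\boldsymbol{X}}_{\mu})\cdot(\partial^{\mu}\hat{\boldsymbol{X}}^{\nu}) \\
&+ e(\hat{\boldsymbol{X}}_{\mu}\times\hat{\boldsymbol{X}}_{\nu})\cdot\partial^{\mu}\hat{\boldsymbol{X}}^{\nu} \\
&- \tfrac{1}{4}e^{2}\left[(\hat{\boldsymbol{X}}^{\mu}\cdot\hat{\boldsymbol{X}}_{\mu})^{2} - (\hat{\boldsymbol{X}}^{\mu}\cdot\hat{\boldsymbol{X}}^{\nu})(\hat{\boldsymbol{X}}_{\mu}\cdot\hat{\boldsymbol{X}}_{\nu})\right].
\end{aligned} \tag{13.99}$$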

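The $X - X - \gamma$ vertex is in the '$e$' term, the $X - X - \gamma - \gamma$ one in the '$e^{2}$'
term. We give the form of the latter using SU(2) 'i,j,k' labels, as shown in
figure 13.8:
$$\begin{aligned}
-ie^{2}\,[\,&\epsilon_{ij\ell}\epsilon_{mn\ell}\,(\epsilon_{1}\cdot\epsilon_{3}\;\epsilon_{2}\cdot\epsilon_{4} - \epsilon_{1}\cdot\epsilon_{4}\;\epsilon_{2}\cdot\epsilon_{3}) \\
&+ \epsilon_{in\ell}\epsilon_{jm\ell}\,(\epsilon_{1}\cdot\epsilon_{2}\;\epsilon_{3}\cdot\epsilon_{4} - \epsilon_{1}\cdot\epsilon_{3}\;\epsilon_{2}\cdot\epsilon_{4}) \\
&+ \epsilon_{im\ell}\epsilon_{nj\ell}\,(\epsilon_{1}\cdot\epsilon_{4}\;\epsilon_{2}\cdot\epsilon_{3} - \epsilon_{1}\cdot\epsilon_{2}\;\epsilon_{3}\cdot\epsilon_{4})\,]
\end{aligned} \tag{13.100}$$

FIGURE 13.8
$4 - X$ vertex.

The reason for the collection of terms seen in (13.96) and (13.100) can be
understood as follows. Consider the $3 - X$ vertex
$$\langle k_{2}, \epsilon_{2}, j;\; k_{3}, \epsilon_{3}, k \mid (\hat{\boldsymbol{X}}_{\mu}\times\hat{\boldsymbol{X}}_{\nu})\cdot\partial^{\mu}\hat{\boldsymbol{X}}^{\nu} \mid k_{1}, \epsilon_{1}, i\rangle \tag{13.101}$$
for example. When each $\hat{\boldsymbol{X}}$ is expressed as a mode expansion, and the initial
and final states are also written in terms of appropriate $\hat{a}$'s and $\hat{a}^{\dagger}$'s, the
amplitude will be a vacuum expectation value (vev) of six $\hat{a}$'s and $\hat{a}^{\dagger}$'s; the
different terms in (13.96) arise from the different ways of getting a non-zero
value for this vev, by manipulations similar to those in section 6.3.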

We end this chapter by presenting an introduction to the problem of
quantizing non-Abelian gauge field theories. Our aim will be, first, to indicate
where the approach followed for the Abelian gauge field $\hat{A}^{\mu}$ in section 7.3.2
fails; and then to show how the assumption (nevertheless) that the
Feynman rules we have established for tree graphs work for loops as well, leads
to violations of unitarity. This calculation will indicate a very curious way of
remedying the situation 'by hand', through the introduction of ghost particles,
only present in loops.


13.3.3 Quantizing non-Abelian gauge fields 


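We consider for definiteness the SU(2) gauge theory with massless gauge fields
$\hat{\boldsymbol{W}}^{\mu}(x)$, which we shall call gluons, by a slight abuse of language. We try to
carry through for the Yang-Mills Lagrangian
$$\hat{\mathcal{L}}_{2} = -\tfrac{1}{4}\hat{\boldsymbol{F}}_{\mu\nu}\cdot\hat{\boldsymbol{F}}^{\mu\nu}, \tag{13.102}$$
where
$$\hat{\boldsymbol{F}}_{\mu\nu} = \partial_{\mu}\hat{\boldsymbol{W}}_{\nu} - \partial_{\nu}\hat{\boldsymbol{W}}_{\mu} - g\hat{\boldsymbol{W}}_{\mu}\times\hat{\boldsymbol{W}}_{\nu}, \tag{13.103}$$
the same steps we followed for the Maxwell one in section 7.3.2.

We begin by re-formulating the prescription arrived at in (7.119), which
we reproduce again here for convenience:
$$\hat{\mathcal{L}}_{\xi} = -\tfrac{1}{4}\hat{F}_{\mu\nu}\hat{F}^{\mu\nu} - \frac{1}{2\xi}(\partial_{\mu}\hat{A}^{\mu})^{2}. \tag{13.104}$$
$\hat{\mathcal{L}}_{\xi}$ leads to the equation of motion
$$\Box\hat{A}^{\mu} - \partial^{\mu}\partial_{\nu}\hat{A}^{\nu} + \frac{1}{\xi}\partial^{\mu}\partial_{\nu}\hat{A}^{\nu} = 0. \tag{13.105}$$
This has the drawback that the limit $\xi \to 0$ appears to be singular (though the
propagator (7.122) is well-behaved as $\xi \to 0$). To avoid this unpleasantness,
consider the Lagrangian (Lautrup 1967)
$$\hat{\mathcal{L}}_{\xi B} = -\tfrac{1}{4}\hat{F}_{\mu\nu}\hat{F}^{\mu\nu} + \hat{B}\,\partial_{\mu}\hat{A}^{\mu} + \tfrac{1}{2}\xi\hat{B}^{2} \tag{13.106}$$
where $\hat{B}$ is a scalar field. We may think of the '$\hat{B}\,\partial\cdot\hat{A}$' term as a field theory
analogue of the procedure followed in classical Lagrangian mechanics, whereby
a constraint (in this case the gauge-fixing one $\partial\cdot\hat{A} = 0$) is brought into the
Lagrangian with a 'Lagrange multiplier' (here the auxiliary field $\hat{B}$). The
momentum conjugate to $\hat{A}^{0}$ is now
$$\hat{\pi}^{0} = \hat{B} \tag{13.107}$$
while the Euler-Lagrange equations for $\hat{A}^{\mu}$ read
$$\Box\hat{A}^{\mu} - \partial^{\mu}\partial_{\nu}\hat{A}^{\nu} = \partial^{\mu}\hat{B}, \tag{13.108}$$
and for $\hat{B}$ yield
$$\partial_{\mu}\hat{A}^{\mu} + \xi\hat{B} = 0. \tag{13.109}$$

Eliminating $\hat{B}$ from (13.106) by means of (13.109) we recover (13.104). Taking
$\partial_{\mu}$ of (13.108) we learn that $\Box\hat{B} = 0$, so that $\hat{B}$ is a free massless field.
Applying $\Box$ to (13.109) then shows that $\Box\,\partial_{\mu}\hat{A}^{\mu} = 0$, so that $\partial_{\mu}\hat{A}^{\mu}$ is also a
free massless field.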
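The elimination of the auxiliary field is a one-line computation, sketched here for the classical fields. It reads (13.109) as $\partial_{\mu}A^{\mu} + \xi B = 0$ and (13.108) as $\Box A^{\mu} - \partial^{\mu}\partial_{\nu}A^{\nu} = \partial^{\mu}B$.

```latex
% Solve (13.109) for B:
B = -\frac{1}{\xi}\,\partial_{\mu}A^{\mu}.

% Substitute into the B-dependent terms of (13.106):
B\,\partial_{\mu}A^{\mu} + \tfrac{1}{2}\xi B^{2}
 = -\frac{1}{\xi}(\partial_{\mu}A^{\mu})^{2} + \frac{1}{2\xi}(\partial_{\mu}A^{\mu})^{2}
 = -\frac{1}{2\xi}(\partial_{\mu}A^{\mu})^{2},
% which is the gauge-fixing term of (13.104).

% The same substitution in (13.108) reproduces (13.105):
\Box A^{\mu} - \partial^{\mu}\partial_{\nu}A^{\nu}
 = -\frac{1}{\xi}\,\partial^{\mu}\partial_{\nu}A^{\nu}.

% Taking \partial_\mu of (13.108): the left-hand side vanishes identically, so
\partial_{\mu}\left(\Box A^{\mu} - \partial^{\mu}\partial_{\nu}A^{\nu}\right) = 0
 \quad\Longrightarrow\quad \Box B = 0 .
```

In the form (13.106) the limit $\xi \to 0$ is harmless: the $\tfrac{1}{2}\xi B^{2}$ term simply drops out and (13.109) becomes the constraint $\partial_{\mu}A^{\mu} = 0$.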

In this formulation, the appropriate subsidiary condition for getting rid of
the unphysical (non-transverse) degrees of freedom is (cf (7.111))
$$\hat{B}^{(+)}(x)\,|\Psi\rangle = 0. \tag{13.110}$$

Kugo and Ojima (1979) have shown that (13.110) provides a satisfactory
definition of the Hilbert space of states. In addition to this it is also essential to
prove that all physical results are independent of the gauge parameter $\xi$.

We now try to generalize the foregoing in a straightforward way to (13.102).
The obvious analogue of (13.106) would be to consider
$$\hat{\mathcal{L}}_{2,\xi B} = -\tfrac{1}{4}\hat{\boldsymbol{F}}_{\mu\nu}\cdot\hat{\boldsymbol{F}}^{\mu\nu} + \hat{\boldsymbol{B}}\cdot(\partial_{\mu}\hat{\boldsymbol{W}}^{\mu}) + \tfrac{1}{2}\xi\hat{\boldsymbol{B}}\cdot\hat{\boldsymbol{B}} \tag{13.111}$$
where $\hat{\boldsymbol{B}}$ is an SU(2) triplet of scalar fields. Equation (13.111) gives (cf
(13.108))
$$(\hat{D}^{\nu})_{ij}\hat{F}_{j\mu\nu} + \partial_{\mu}\hat{B}_{i} = 0 \tag{13.112}$$
where the covariant derivative is now the one appropriate to the SU(2) triplet
$\hat{\boldsymbol{F}}_{\mu\nu}$ (see (13.44) with $t = 1$, and (12.48)), and $i,j$ are the SU(2) labels.
Similarly, (13.109) becomes
$$\partial_{\mu}\hat{\boldsymbol{W}}^{\mu} + \xi\hat{\boldsymbol{B}} = 0. \tag{13.113}$$
It is possible to verify that
$$(\hat{D}^{\mu})_{ki}(\hat{D}^{\nu})_{ij}\hat{F}_{j\mu\nu} = 0 \tag{13.114}$$
where $i,j,k$ are the SU(2) matrix indices, which implies that
$$(\hat{D}^{\mu})_{ki}\,\partial_{\mu}\hat{B}_{i} = 0. \tag{13.115}$$

This is the crucial result: it implies that the auxiliary field $\hat{\boldsymbol{B}}$ is not a free
field in this non-Abelian case, and so neither (from (13.113)) is $\partial_{\mu}\hat{\boldsymbol{W}}^{\mu}$. In
consequence, the obvious generalizations of (7.108) or (13.110) cannot be used
to define the physical (transverse) states. The reason is that a condition like
(13.110) must hold for all times, and only if the field is free is its time variation
known (and essentially trivial).

Let us press ahead nevertheless, and assume that the rules we have derived 
so far are the correct Feynman rules for this gauge theory. We will see that 
this leads to physically unacceptable consequences, namely to the violation of 
unitarity. 

In fact, this is a problem which threatens all gauge theories if the gauge 
field is treated covariantly, i.e. as a 4-vector. As we saw in section 7.3.2, this 
introduces unphysical degrees of freedom which must somehow be eliminated 
from the theory, or at least prevented from affecting physical processes. In 
QED we do this by imposing the condition (7.111), or (13.110), but as we 
have seen the analogous conditions will not work in the non-Abelian case, and 
so unphysical states may make their presence felt, for example in the ‘sum 
over intermediate states’ which arises in the unitarity relation. This relation 
determines the imaginary part of an amplitude via an equation of the form 
(cf (11.65)) 


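\[ 2\,\mathrm{Im}\,\langle f|\mathcal M|i\rangle = \int\sum_n \langle f|\mathcal M|n\rangle\langle n|\mathcal M^\dagger|i\rangle\,d\rho_n \tag{13.116} \]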


where $\langle f|\mathcal M|i\rangle$ is the (Feynman) amplitude for the process $i\to f$, and
the sum is over a complete set of physical intermediate states $|n\rangle$, which
can enter at the given energy; $d\rho_n$ represents the phase space element for
the general intermediate state $|n\rangle$. Consider now the possibility of gauge
quanta appearing in the states $|n\rangle$. Since unitarity deals only with physical
states, such quanta can have only the two degrees of freedom (polarizations) 
allowed for a physical massless gauge field (cf section 7.3.1). Now part of the 
power of the ‘Feynman rules’ approach to perturbation theory is that it is 
manifestly covariant. But there is no completely covariant way of selecting 
out just the two physical components of a massless polarization vector $\epsilon_\mu$,
from the four originally introduced precisely for reasons of covariance. In 
fact, when gauge quanta appear as virtual particles in intermediate states in 
Feynman graphs, they will not be restricted to having only two polarization 
states (as we shall see explicitly in a moment). Hence there is a real chance 
that when the imaginary part of such graphs is calculated, a contribution from
the unphysical polarization states will be found, which has no counterpart at
all in the physical unitarity relation, so that unitarity will not be satisfied.
Since unitarity is an expression of conservation of probability, its violation is
a serious disease indeed.

FIGURE 13.9
Two-gluon intermediate state in the unitarity relation for the amplitude for $q\bar q\to q\bar q$.

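Consider, for example, the process $q\bar q\to q\bar q$ (where the ‘quarks’ are an
SU(2) doublet), whose imaginary part has a contribution from a state containing
two gluons (figure 13.9):


\[ 2\,\mathrm{Im}\,\langle q\bar q|\mathcal M|q\bar q\rangle = \int\sum \langle q\bar q|\mathcal M|gg\rangle\langle gg|\mathcal M^\dagger|q\bar q\rangle\,d\rho_2 \tag{13.117} \]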


where $d\rho_2$ is the 2-body phase space for the g-g state. The 2-gluon amplitudes
in (13.117) must have the form


\[ \mathcal M_{\mu_1\nu_1}\,\epsilon_1^{\mu_1}(k_1,\lambda_1)\,\epsilon_2^{\nu_1}(k_2,\lambda_2) \tag{13.118} \]


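where $\epsilon^\mu(k,\lambda)$ is the polarization vector for the gluon with polarization $\lambda$ and
4-momentum $k$. The sum in (13.117) is then to be performed over $\lambda_1 = 1, 2$
and $\lambda_2 = 1, 2$ which are the physical polarization states (cf section 7.3.1).
Thus (13.117) becomes


\[ 2\,\mathrm{Im}\,\mathcal M_{q\bar q\to q\bar q} = \int\sum_{\lambda_1=1,2;\,\lambda_2=1,2} \mathcal M_{\mu_1\nu_1}\,\epsilon_1^{\mu_1}(k_1,\lambda_1)\,\epsilon_2^{\nu_1}(k_2,\lambda_2)\;\mathcal M^*_{\mu_2\nu_2}\,\epsilon_1^{\mu_2}(k_1,\lambda_1)\,\epsilon_2^{\nu_2}(k_2,\lambda_2)\,d\rho_2. \tag{13.119} \]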


For later convenience we are using real polarization vectors as in (7.81) and
(7.82): $\epsilon(k_i,\lambda_i = +1) = (0,1,0,0)$, $\epsilon(k_i,\lambda_i = -1) = (0,0,1,0)$; and of course
$k_1^2 = k_2^2 = 0$.

FIGURE 13.10
Some $O(g^4)$ contributions to $q\bar q\to q\bar q$.

We now wish to find out whether or not a result of the form (13.119)
will hold when the $\mathcal M$’s represent some suitable Feynman graphs. We first
note that we want the unitarity relation (13.119) to be satisfied order by
order in perturbation theory: that is to say, when the $\mathcal M$’s on both sides are
expanded in powers of the coupling strengths (as in the usual Feynman graph
expansion), the coefficients of corresponding powers on each side should be
equal. Since each emission or absorption of a gluon produces one power of the
SU(2) coupling $g$, the right-hand side of (13.119) involves at least the power
$g^4$. Thus the lowest-order process in which (13.119) may be tested is for the
fourth-order amplitude $\mathcal M^{(4)}_{q\bar q\to q\bar q}$. There are quite a number of contributions
to $\mathcal M^{(4)}_{q\bar q\to q\bar q}$, some of which are shown in figure 13.10; all contain a loop. On
the right-hand side of (13.119), each $\mathcal M$ involves two polarization vectors, and
so each must represent the $O(g^2)$ contribution to $q\bar q\to gg$, which we call $\mathcal M^{(2)}_{\mu\nu}$;
thus both sides are consistently of order $g^4$. There are three contributions to
$\mathcal M^{(2)}_{\mu\nu}$ shown in figure 13.11; when these are placed in (13.119), contributions
to the imaginary part of $\mathcal M^{(4)}_{q\bar q\to q\bar q}$ are generated, which should agree with the
imaginary part of the total $O(g^4)$ loop-graph contribution. Let us see if this
works out. We choose to work in the gauge $\xi = 1$, so that the gluon propagator
takes the familiar form $-ig^{\mu\nu}\delta_{ij}/k^2$. According to the rules for propagators
and vertices already given, each of the loop amplitudes $\mathcal M^{(4)}_{q\bar q\to q\bar q}$ (e.g. those
of figure 13.10) will be proportional to the product of the propagators for the
quarks and the gluons, together with appropriate ‘$\gamma$’ and ‘$\tau$’ vertex factors,
the whole being integrated over the loop momentum. The extraction of the 
imaginary part of a Feynman diagram is a technical matter, having to do with 
careful consideration of the ‘ie’ in the propagators. Rules for doing this exist 
(Eden et al. 1966, section 2.9), and in the present case the result is that, to 
compute the imaginary part of the amplitudes of figure 13.10, one replaces 
each gluon propagator of momentum k by 


\[ \pi(-g^{\mu\nu})\,\delta(k^2)\,\theta(k_0)\,\delta_{ij}. \tag{13.120} \]


FIGURE 13.11
$O(g^2)$ contributions to $q\bar q\to gg$.

That is, the propagator is replaced by a condition stating that, in evaluating
the imaginary part of the diagram, the gluon's mass is constrained to have
the physical (free-field) value of zero, instead of varying freely as the loop
momentum varies, and its energy is positive. These conditions (one for each
gluon) have the effect of converting the loop integral into a standard two-body
phase space integral for the gg intermediate state, so that eventually


\[ 2\,\mathrm{Im}\,\mathcal M^{(4)}_{q\bar q\to q\bar q} = \int \mathcal M^{(2)}_{\mu_1\nu_1}\,(-g^{\mu_1\mu_2})\,\mathcal M^{(2)*}_{\mu_2\nu_2}\,(-g^{\nu_1\nu_2})\,d\rho_2 \tag{13.121} \]
where $\mathcal M^{(2)}_{\mu\nu}$ is the sum of the three $O(g^2)$ tree graphs shown in figure 13.11,
with all external legs satisfying the ‘mass-shell’ conditions.
So, the imaginary part of the loop contribution to $\mathcal M^{(4)}_{q\bar q\to q\bar q}$ does seem to
have the form (13.116) as required by unitarity, with $|n\rangle$ the gg intermediate
state as in (13.119). But there is one essential difference between (13.121) and
(13.119): the place of the factor $-g^{\mu\nu}$ in (13.121) is taken in (13.119) by the
gluon polarization sum

\[ P^{\mu\nu}(k) = \sum_{\lambda=1,2}\epsilon^\mu(k,\lambda)\,\epsilon^\nu(k,\lambda) \tag{13.122} \]

for $k = k_1, k_2$ and $\lambda = \lambda_1, \lambda_2$ respectively. Thus we have to investigate whether
this difference matters.

To proceed further, it is helpful to have an explicit expression for $P^{\mu\nu}$. We
might think of calculating the necessary sum over $\lambda$ by brute force, using two
$\epsilon$'s specified by the conditions (cf (7.87))


\[ \epsilon^\mu(k,\lambda)\,\epsilon_\mu(k,\lambda') = -\delta_{\lambda\lambda'}, \qquad \epsilon\cdot k = 0. \tag{13.123} \]


The trouble is that conditions (13.123) do not fix the $\epsilon$'s uniquely if $k^2 = 0$.
(Note the $\delta(k^2)$ in (13.120).) Indeed, it is precisely the fact that any
given $\epsilon_\mu$ satisfying (13.123) can be replaced by $\epsilon_\mu + \lambda k_\mu$ that both reduces
the degrees of freedom to two (as we saw in section 7.3.1), and evinces the
essential arbitrariness in the $\epsilon_\mu$ specified only by (13.123). In order to calculate
(13.122), we need to put another condition on $\epsilon_\mu$, so as to fix it uniquely. A
standard choice (see e.g. Taylor 1976, pp 14-15) is to supplement (13.123)
with the further condition

\[ t\cdot\epsilon = 0 \tag{13.124} \]
where $t$ is some 4-vector. This certainly fixes $\epsilon_\mu$, and enables us to calculate
(13.122), but of course now two further difficulties have appeared: namely, the
physical results seem to depend on $t_\mu$; and have we not lost Lorentz covariance,
because the theory involves a special 4-vector $t_\mu$?

Setting these questions aside for the moment, we can calculate (13.122) 
using the conditions (13.123) and (13.124), finding (problem 13.11) 


\[ P_{\mu\nu} = -g_{\mu\nu} - [t^2 k_\mu k_\nu - k\cdot t\,(k_\mu t_\nu + k_\nu t_\mu)]/(k\cdot t)^2. \tag{13.125} \]


But only the first term on the right-hand side of (13.125) is to be seen in 
(13.121). A crucial quantity is clearly 


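\[ U_{\mu\nu}(k,t) \equiv -g_{\mu\nu} - P_{\mu\nu} = [t^2 k_\mu k_\nu - k\cdot t\,(k_\mu t_\nu + k_\nu t_\mu)]/(k\cdot t)^2. \tag{13.126} \]


We note that whereas
\[ k^\mu P_{\mu\nu} = k^\nu P_{\mu\nu} = 0 \tag{13.127} \]
(from the condition $k\cdot\epsilon = 0$), the same is not true of $k^\mu U_{\mu\nu}$; in fact,
\[ k^\mu U_{\mu\nu} = -k_\nu \tag{13.128} \]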


where we have used $k^2 = 0$. It follows that $U_{\mu\nu}$ may be regarded as including
polarization states for which $\epsilon\cdot k \neq 0$. In physical terms, therefore, a gluon
appearing internally in a Feynman graph has to be regarded as existing in more
than just the two polarization states available to an external gluon (cf section
7.3.1). $U_{\mu\nu}$ characterizes the contribution of these unphysical polarization
states.
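The relations (13.125) to (13.128) can be checked numerically. The following is a minimal sketch in Python with NumPy (our choice for illustration; it is not part of the original text). It assumes the metric diag(1, -1, -1, -1), takes $k$ along the z-axis, and builds two polarization vectors satisfying (13.123) and (13.124) by adding a multiple of $k$ to the transverse unit vectors. All function names are hypothetical helpers.

```python
import numpy as np

G = np.diag([1.0, -1.0, -1.0, -1.0])  # metric g^{mu nu}, assumed signature (+,-,-,-)

def dot(a, b):
    """Minkowski product a.b of two upper-index 4-vectors."""
    return a @ G @ b

def pol_vectors(k, t):
    """Two real polarization vectors with eps.k = 0 and t.eps = 0 (k along z)."""
    transverse = [np.array([0.0, 1.0, 0.0, 0.0]), np.array([0.0, 0.0, 1.0, 0.0])]
    # Shifting by a multiple of k preserves (13.123) because k^2 = 0 and eps.k = 0.
    return [e - (dot(t, e) / dot(t, k)) * k for e in transverse]

def P_sum(k, t):
    """Polarization sum (13.122), with upper indices."""
    return sum(np.outer(e, e) for e in pol_vectors(k, t))

def P_closed(k, t):
    """Closed form (13.125), with upper indices."""
    kt = dot(k, t)
    return -G - (dot(t, t) * np.outer(k, k)
                 - kt * (np.outer(k, t) + np.outer(t, k))) / kt**2

def U(k, t):
    """Unphysical remainder (13.126)."""
    return -G - P_closed(k, t)

kmag = 2.5
k = np.array([kmag, 0.0, 0.0, kmag])    # massless: k^2 = 0
t = np.array([0.9, 0.3, -0.4, 0.2])     # arbitrary 4-vector with k.t != 0
k_lower = G @ k                         # k_mu

print(np.allclose(P_sum(k, t), P_closed(k, t)))     # checks (13.125)
print(np.allclose(k_lower @ P_closed(k, t), 0.0))   # checks (13.127)
print(np.allclose(k_lower @ U(k, t), -k))           # checks (13.128)
```

Changing `t` changes both `P_sum` and `P_closed`, but the three checks are expected to keep holding as long as $k\cdot t \neq 0$.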
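The discrepancy between (13.121) and (13.119) is then
\[ 2\,\mathrm{Im}\,\mathcal M^{(4)}_{q\bar q\to q\bar q} = \int \mathcal M^{(2)}_{\mu_1\nu_1}\,[U^{\mu_1\mu_2}(k_1,t_1)]\,\mathcal M^{(2)*}_{\mu_2\nu_2}\,[U^{\nu_1\nu_2}(k_2,t_2)]\,d\rho_2, \tag{13.129} \]
together with similar terms involving one $P$ and one $U$. It follows that these
unwanted contributions will, in fact, vanish if
\[ k_1^{\mu_1}\mathcal M^{(2)}_{\mu_1\nu_1} = 0, \tag{13.130} \]
and similarly for $k_2$. This will also ensure that amplitudes are independent of
$t_\mu$.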

Condition (13.130) is apparently the same as the U(1) gauge invariance 
requirement of (8.165), already recalled in the previous section. As discussed 
there, it can be interpreted here also as expressing gauge invariance in the 
non-Abelian case, working to this given order in perturbation theory. Indeed, 
the diagrams of figure 13.11 are essentially ‘crossed’ versions of those in figure 
13.5. However, there is one crucial difference here. In figure 13.5, both the 
X’s were physical, their polarizations satisfying the condition $\epsilon\cdot k = 0$. In
figure 13.11, by contrast, neither of the gluons, in the discrepant contribution
(13.129), satisfies $\epsilon\cdot k = 0$ (see the sentence following (13.128)). Thus the
crucial point is that (13.130) must be true for each gluon, even when the other
gluon has $\epsilon\cdot k \neq 0$. And, in fact, we shall now see that whereas the (crossed)
version of (13.130) did hold for our $dX\to dX$ amplitudes of section 13.3.2,
(13.130) fails for states with $\epsilon\cdot k \neq 0$.

The three graphs of figure 13.11 together yield 


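\[
\begin{aligned}
\mathcal M^{(2)}_{\mu_1\nu_1}\,\epsilon_1^{\mu_1}(k_1,\lambda_1)\,\epsilon_2^{\nu_1}(k_2,\lambda_2)
 ={}& g^2\,\bar v(p_2)\,\not\epsilon_2\,a_{2j}\frac{\tau_j}{2}\,\frac{1}{\not p_1 - \not k_1 - m}\,\frac{\tau_i}{2}\,a_{1i}\,\not\epsilon_1\,u(p_1) \\
 &+ g^2\,\bar v(p_2)\,\not\epsilon_1\,a_{1i}\frac{\tau_i}{2}\,\frac{1}{\not p_1 - \not k_2 - m}\,\frac{\tau_j}{2}\,a_{2j}\,\not\epsilon_2\,u(p_1) \\
 &+ (-i)g^2\epsilon_{kij}\,[(p_1+p_2+k_1)^\nu g^{\mu\rho} + (-k_2-p_1-p_2)^\mu g^{\rho\nu} + (-k_1+k_2)^\rho g^{\mu\nu}] \\
 &\quad\times \epsilon_{1\mu}a_{1i}\,\epsilon_{2\nu}a_{2j}\,\frac{-1}{(p_1+p_2)^2}\,\bar v(p_2)\gamma_\rho\frac{\tau_k}{2}u(p_1)
\end{aligned}
\tag{13.131}
\]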


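where we have written the gluon polarization vectors as a product of a Lorentz
4-vector $\epsilon_\mu$ and an ‘SU(2) polarization vector’ $a_i$ to specify the triplet state
label. Now replace $\epsilon_1$, say, by $k_1$. Using the Dirac equation for $u(p_1)$ and
$\bar v(p_2)$ the first two terms reduce to (cf (13.94))


\[ g^2\,\bar v(p_2)\,\not\epsilon_2\,[\tau_i/2, \tau_j/2]\,u(p_1)\,a_{1i}a_{2j} = ig^2\,\bar v(p_2)\,\not\epsilon_2\,\epsilon_{ijk}(\tau_k/2)\,u(p_1)\,a_{1i}a_{2j} \tag{13.132} \]

using the SU(2) algebra of the $\tau$'s. The third term in (13.131) gives
\[ -ig^2\,\epsilon_{ijk}\,\bar v(p_2)\,\not\epsilon_2\,(\tau_k/2)\,u(p_1)\,a_{1i}a_{2j} \tag{13.133} \]
\[ {}+ig^2\,\frac{\epsilon_{ijk}}{2k_1\cdot k_2}\,\bar v(p_2)\,\not k_1\,(\tau_k/2)\,u(p_1)\;k_2\cdot\epsilon_2\;a_{1i}a_{2j}. \tag{13.134} \]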


We see that the first part (13.133) certainly does cancel (13.132), but there 
remains the second piece (13.134), which only vanishes if $k_2\cdot\epsilon_2 = 0$. This is
not sufficient to guarantee the absence of all unphysical contributions to the 
imaginary part of the 2-gluon graphs, as the preceding discussion shows. We 
conclude that loop diagrams involving two (or, in fact, more) gluons, if con- 
structed according to the simple rules for tree diagrams, will violate unitarity. 
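The cancellation of (13.132) against (13.133), and the survival of (13.134), can be checked numerically. The sketch below (Python with NumPy, our illustration rather than anything in the original text) evaluates the three tree terms of (13.131) with $\epsilon_1$ replaced by $k_1$. Several things are assumptions made only for this check: the Dirac representation of the $\gamma$ matrices, the metric diag(1, -1, -1, -1), unnormalized spinors that merely satisfy the Dirac equations $(\not p_1 - m)u = 0$ and $\bar v(\not p_2 + m) = 0$, randomly chosen SU(2) doublet and triplet vectors, and all variable names.

```python
import numpy as np

G = np.diag([1.0, -1.0, -1.0, -1.0])              # metric, signature (+,-,-,-)
I2, Z2, I4 = np.eye(2), np.zeros((2, 2)), np.eye(4)
SIGMA = [np.array([[0, 1], [1, 0]], dtype=complex),
         np.array([[0, -1j], [1j, 0]], dtype=complex),
         np.array([[1, 0], [0, -1]], dtype=complex)]
# gamma^0 .. gamma^3 in the Dirac representation (assumed convention)
GAMMA = [np.block([[I2, Z2], [Z2, -I2]]).astype(complex)] + \
        [np.block([[Z2, s], [-s, Z2]]) for s in SIGMA]

def dot(a, b):
    return a @ G @ b

def slash(a):
    """gamma^mu a_mu for an upper-index 4-vector a."""
    return a[0] * GAMMA[0] - sum(a[i] * GAMMA[i] for i in (1, 2, 3))

def T(a):
    """a_i tau_i / 2 for an SU(2) 'polarization' 3-vector a."""
    return sum(a[i] * SIGMA[i] for i in range(3)) / 2

# CM-frame kinematics with p1 + p2 = k1 + k2, p_i^2 = m^2, k_i^2 = 0
m, E, theta, g = 1.0, 2.0, 0.7, 1.3
pz = np.sqrt(E**2 - m**2)
p1, p2 = np.array([E, 0, 0, pz]), np.array([E, 0, 0, -pz])
k1 = E * np.array([1, np.sin(theta), 0, np.cos(theta)])
k2 = E * np.array([1, -np.sin(theta), 0, -np.cos(theta)])

rng = np.random.default_rng(1)
u = (slash(p1) + m * I4) @ rng.normal(size=4)      # obeys (p1-slash - m) u = 0
vbar = rng.normal(size=4) @ (slash(p2) - m * I4)   # obeys vbar (p2-slash + m) = 0
a1, a2 = rng.normal(size=3), rng.normal(size=3)    # triplet labels of the gluons
c_in, c_out = rng.normal(size=2), rng.normal(size=2)  # doublet labels of q, qbar

def colour(M):
    return c_out @ M @ c_in

def amplitude(eps1, eps2):
    """Sum of the three terms of (13.131)."""
    S1 = np.linalg.inv(slash(p1 - k1) - m * I4)
    S2 = np.linalg.inv(slash(p1 - k2) - m * I4)
    t1 = g**2 * (vbar @ slash(eps2) @ S1 @ slash(eps1) @ u) * colour(T(a2) @ T(a1))
    t2 = g**2 * (vbar @ slash(eps1) @ S2 @ slash(eps2) @ u) * colour(T(a1) @ T(a2))
    V = (dot(p1 + p2 + k1, eps2) * eps1 + dot(-k2 - p1 - p2, eps1) * eps2
         + dot(eps1, eps2) * (k2 - k1))
    t3 = (-1j) * g**2 * (-1.0 / dot(p1 + p2, p1 + p2)) \
         * (vbar @ slash(V) @ u) * colour(T(np.cross(a1, a2)))
    return t1 + t2 + t3

def leftover(eps2):
    """The surviving piece (13.134)."""
    return (1j * g**2 / (2 * dot(k1, k2)) * (vbar @ slash(k1) @ u)
            * dot(k2, eps2) * colour(T(np.cross(a1, a2))))

eps_phys = np.array([0.0, 0.0, 1.0, 0.0])          # transverse: k2.eps = 0
k2bar = np.array([-k2[0], k2[1], k2[2], k2[3]])    # unphysical choice, k2.k2bar != 0

print(abs(amplitude(k1, eps_phys)))                # should vanish up to rounding
print(np.isclose(amplitude(k1, k2bar), leftover(k2bar)))
```

With a physical second gluon the contraction with $k_1$ vanishes, as in the Abelian-like Ward identity; with an unphysical second polarization the residue is exactly the piece (13.134).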

The correct rule for such loops must be such as to satisfy unitarity. Since there
seems no other way in which the offending piece in (13.134) can be removed, 
we must infer that the rule for loops will have to involve some extra term, or 
terms, over and above the simple tree-type constructions, which will cancel 
the contributions of unphysical polarization states. To get an intuitive idea 
of what such extra terms might be, we return to expression (13.126) for the 
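sum over unphysical polarization states $U_{\mu\nu}$, and make a specific choice for
$t$. We take $t_\mu = \bar k_\mu$, where the 4-vector $\bar k$ is defined by $\bar k = (-|\boldsymbol k|, \boldsymbol k)$, and
$\boldsymbol k = (0, 0, |\boldsymbol k|)$. This choice obviously satisfies (13.124). Then


\[ U_{\mu\nu}(k,\bar k) = (k_\mu\bar k_\nu + k_\nu\bar k_\mu)/(2|\boldsymbol k|^2) \tag{13.135} \]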
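and unitarity (cf (13.129)) requires


\[ \int \mathcal M^{(2)}_{\mu_1\nu_1}\,\mathcal M^{(2)*}_{\mu_2\nu_2}\,\frac{(k_1^{\mu_1}\bar k_1^{\mu_2} + k_1^{\mu_2}\bar k_1^{\mu_1})}{2|\boldsymbol k_1|^2}\,\frac{(k_2^{\nu_1}\bar k_2^{\nu_2} + k_2^{\nu_2}\bar k_2^{\nu_1})}{2|\boldsymbol k_2|^2}\,d\rho_2 \tag{13.136} \]


to vanish, but it does not. Let us work in the centre of momentum (CM) frame
of the two gluons, with $k_1 = (|\boldsymbol k|, 0, 0, |\boldsymbol k|)$, $k_2 = (|\boldsymbol k|, 0, 0, -|\boldsymbol k|)$,
$\bar k_1 = (-|\boldsymbol k|, 0, 0, |\boldsymbol k|)$, $\bar k_2 = (-|\boldsymbol k|, 0, 0, -|\boldsymbol k|)$, and consider for definiteness
the contractions with the $\mathcal M^{(2)}_{\mu_1\nu_1}$ term. These are $\mathcal M^{(2)}_{\mu_1\nu_1}k_1^{\mu_1}k_2^{\nu_1}$, $\mathcal M^{(2)}_{\mu_1\nu_1}k_1^{\mu_1}\bar k_2^{\nu_1}$,
etc. Such quantities can be calculated from expression (13.131) by setting
$\epsilon_1 = k_1$, $\epsilon_2 = k_2$ for the first, $\epsilon_1 = k_1$, $\epsilon_2 = \bar k_2$ for the second, and so on. We
have already obtained the result of putting $\epsilon_1 = k_1$. From (13.134) it is clear
that a term in which $\epsilon_2$ is replaced by $k_2$ as well as $\epsilon_1$ by $k_1$ will vanish, since
$k_2^2 = 0$. A typical non-vanishing term is of the form $\mathcal M^{(2)}_{\mu_1\nu_1}k_1^{\mu_1}\bar k_2^{\nu_1}/2|\boldsymbol k|^2$.
From (13.134) this reduces to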


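\[ -ig^2\,\frac{\epsilon_{ijk}}{2k_1\cdot k_2}\,\bar v(p_2)\,\not k_1\,(\tau_k/2)\,u(p_1)\,a_{1i}a_{2j} \tag{13.137} \]


using $k_2\cdot\bar k_2/2|\boldsymbol k|^2 = -1$. We may rewrite (13.137) as


\[ j_{\mu k}\,\frac{-g^{\mu\nu}}{(k_1+k_2)^2}\,ig\,\epsilon_{ijk}\,a_{1i}a_{2j}\,k_{1\nu} \tag{13.138} \]


where
\[ j_{\mu k} = g\,\bar v(p_2)\gamma_\mu(\tau_k/2)u(p_1) \tag{13.139} \]


is the SU(2) current associated with the $q\bar q$ pair.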

The unwanted terms of the form (13.138) can be eliminated if we adopt 
the following rule (on the grounds of ‘forcing the theory to make sense’). 
In addition to the fourth-order diagrams of the type shown in figure 13.10, 
constructed according to the simple ‘tree’ prescriptions, there must exist a 
previously unknown fourth-order contribution, only present in loops, such that 
it has an imaginary part which is non-zero in the same physical region as the 
two-gluon intermediate state, and moreover is of just the right magnitude to 
cancel all the contributions to (13.136) from terms like (13.138). Now (13.138) 
has the appearance of a one-gluon intermediate state amplitude. The $q\bar q\to g$
vertex is represented by the current (13.139), the gluon propagator appears
in Feynman gauge $\xi = 1$, and the rest of the expression would have the
interpretation of a coupling between the intermediate gluon and two scalar
particles with SU(2) polarizations $a_{1i}, a_{2j}$. Thus (13.138) can be interpreted
as the amplitude for the tree graph shown in figure 13.12, where the dotted 
lines represent the scalar particles. It seems plausible, therefore, that the 
fourth-order graph we are looking for has the form shown in figure 13.13. 
The new scalar particles must be massless, so that this new amplitude has 
an imaginary part in the same physical region as the gg state. When the 
imaginary part of figure 13.13 is calculated in the usual way, it will involve 
contributions from the tree graph of figure 13.12, and these can be arranged
to cancel the unphysical polarization pieces like (13.138).

FIGURE 13.12
Tree graph interpretation of the expression (13.138).

FIGURE 13.13
Ghost loop diagram contributing in fourth order to $q\bar q\to q\bar q$.

For this cancellation to work, the scalar particle loop graph of figure 13.13 
must enter with the opposite sign from the three-gluon loop graph of figure 
13.10, which in retrospect was the cause of all the trouble. Such a relative 
minus sign between single closed loop graphs would be expected if the scalar 
particles in figure 13.13 were in fact fermions! (Recall the rule given in section 
11.3 and problem 11.2). Thus we appear to need scalar particles obeying Fermi 
statistics. Such particles are called ‘ghosts’. We must emphasize that although 
we have introduced the tree graph of figure 13.12, which apparently involves 
ghosts as external lines, in reality the ghosts are always confined to loops, their 
function being to cancel unphysical contributions from intermediate gluons. 

The preceding discussion has, of course, been entirely heuristic. It can
be followed through so as to yield the correct prescription for eliminating 
unphysical contributions from a single closed gluon loop. But, as Feynman 
recognized (1963, 1977), unitarity alone is not a sufficient constraint to provide 
the prescription for more than one closed gluon loop. Clearly what is required 
is some additional term in the Lagrangian, which will do the job in general. 
Such a term indeed exists, and was first derived using the path integral form 
of quantum field theory (see chapter 16) by Faddeev and Popov (1967). The 
result is that the covariant gauge-fixing term (13.68) must be supplemented 
by the ‘ghost Lagrangian’ 


\[ \hat{\mathcal L}_g = \partial_\mu\hat\eta_i^\dagger\,\hat D^\mu_{ij}\,\hat\eta_j \tag{13.140} \]

where the $\hat\eta$ field is an SU(2) triplet, and spinless, but obeying anticommutation
relations; the covariant derivative is the one appropriate for an SU(2) triplet,
namely (from (13.44) and (12.48))
\[ \hat D^\mu_{ij} = \partial^\mu\delta_{ij} + g\,\epsilon_{kij}\hat W^\mu_k \tag{13.141} \]


in this case. The result (13.140) is derived in standard books of quantum 
field theory, for example Cheng and Li (1984), Peskin and Schroeder (1995) 
or Ryder (1996). We should add the caution that the form of the ghost 
Lagrangian depends on the choice of the gauge-fixing term; there are gauges 
in which the ghosts are absent. Feynman rules for non-Abelian gauge field 
theories are given in Cheng and Li (1984), for example. We give the rules for 
tree diagrams, for which there are no problems with ghosts, in appendix Q. 
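The form of the triplet covariant derivative in (13.141) can be checked with a few lines of Python (NumPy). This is a sketch under a stated assumption: we take the $T = 1$ matrices to be $(T^{(1)}_k)_{ij} = -i\epsilon_{kij}$, which is what we understand (12.48) to say, but that equation is not reproduced in this section. With that choice, the general form $\partial^\mu + ig\,\boldsymbol T^{(1)}\cdot\boldsymbol W^\mu$ gives the gauge term $g\,\epsilon_{kij}W^\mu_k$. The numerical values of `g` and `W` are arbitrary.

```python
import numpy as np

def levi_civita():
    """Antisymmetric symbol eps_{ijk} with eps_{123} = +1 (indices 0..2 here)."""
    eps = np.zeros((3, 3, 3))
    for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
        eps[i, j, k] = 1.0    # even permutations
        eps[i, k, j] = -1.0   # odd permutations
    return eps

EPS = levi_civita()
T = -1j * EPS                 # T[k][i, j] = -i eps_{kij}  (assumed form of (12.48))

def commutator(a, b):
    return a @ b - b @ a

# SU(2) algebra: [T_i, T_j] = i eps_{ijk} T_k
algebra_ok = all(
    np.allclose(commutator(T[i], T[j]), 1j * np.einsum('k,kab->ab', EPS[i, j], T))
    for i in range(3) for j in range(3))

g = 0.65
W = np.array([0.4, -1.2, 0.9])   # the three W_k at one fixed Lorentz index mu
gauge_term = 1j * g * np.einsum('k,kij->ij', W, T)   # i g (T.W)_{ij}
expected = g * np.einsum('kij,k->ij', EPS, W)        # g eps_{kij} W_k, as in (13.141)

print(algebra_ok, np.allclose(gauge_term, expected))
```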


Problems 

13.1 Verify that (13.34) reduces to (13.26) in the infinitesimal case. 

13.2 Verify equation (13.45). 

13.3 Using the expression for $D^\mu$ in (13.47), verify (13.48).

13.4 Verify the transformation law (13.51) of $\boldsymbol F^{\mu\nu}$ under local SU(2) transformations.

13.5 Verify that $\boldsymbol F_{\mu\nu}\cdot\boldsymbol F^{\mu\nu}$ is invariant under local SU(2) transformations.

13.6 Verify that the (infinitesimal) transformation law (13.56) for the SU(3)
gauge field $A_a^\mu$ is consistent with (13.55).

13.7 By considering the commutator of two $D^\mu$'s of the form (13.53), verify
(13.61).

13.8 Verify that (13.84) reduces to (13.86) (omitting the $(2\pi)^4\delta^4$ factors).

13.9 Verify that the replacement of $\epsilon_1$ by $k_1$ in (13.93) leads to (13.94).

13.10 Verify that when $\epsilon_1$ is replaced by $k_1$ in (13.95), the resulting amplitude
cancels the contribution (13.94), provided that $\delta = 1$.

13.11 Show that $P^{\mu\nu}$ of (13.122), with the $\epsilon$’s specified by the conditions
(13.123) and (13.124), is given by (13.125).
QCD I: Introduction, Tree Graph
Predictions, and Jets 


In the previous chapter we have introduced the elementary concepts and for- 
malism associated with non-Abelian quantum gauge field theories. It is now 
well established that the strong interactions between quarks are described by 
a theory of this type, in which the gauge group is an SU(3)$_c$, acting on a
degree of freedom called ‘colour’ (indicated by the subscript c). This theory 
is called Quantum Chromodynamics, or QCD for short. QCD will be our first 
application of the theory developed in chapter 13, and we shall devote the 
next two chapters, and much of chapter 16, to it. 


In the present chapter we introduce QCD and discuss some of its simpler 
experimental consequences. We briefly recall the evidence for the ‘colour’ de- 
gree of freedom in section 14.1, and then proceed to the dynamics of colour, 
and the QCD Lagrangian, in section 14.2. Perhaps the most remarkable thing 
about the dynamics of QCD is that, despite its being a theory of the strong 
interactions, there are certain kinematic regimes — roughly speaking, short dis- 
tances or high energies — in which it is effectively a quite weakly interacting the- 
ory. This is a consequence of a fundamental property, possessed only by non- 
Abelian gauge theories, whereby the effective interaction strength becomes 
progressively smaller in such regimes. This property is called ‘asymptotic 
freedom’, and was already mentioned in section 11.5.3 of volume 1. In appropriate
cases, therefore, the lowest-order perturbation theory amplitudes (tree
graphs) provide a very convincing qualitative, or even ‘semi-quantitative’, ori- 
entation to the data. In sections 14.3 and 14.4 we shall see how the tree graph 
techniques acquired for QED in volume 1 produce more useful physics when 
applied to QCD. 


However, most of the quantitative experimental support for QCD has come 
from comparison with predictions which include higher-order QCD correc- 
tions; indeed, the asymptotic freedom property itself emerges from summing a 
whole class of higher-order contributions, as we shall indicate at the beginning 
of chapter 15. This immediately involves all the apparatus of renormalization. 
The necessary calculations quite rapidly become too technical for the intended
scope of this book, but in chapter 15 we shall try to provide an elementary in- 
troduction to the issues involved, and to the necessary techniques, by building 
on the discussion of renormalization given in chapters 10 and 11 of volume 1. 
The main new concept will be the renormalization group (and related ideas), 
which is an essential tool in the modern confrontation of perturbative QCD
with data. Some of the simpler predictions of the renormalization group tech- 
nique will be compared with experimental data in the last part of chapter 
15. 

In chapter 16 we work towards understanding some non-perturbative as- 
pects of QCD. As a natural concomitant of asymptotic freedom, it is to be 
expected that the effective coupling strength becomes progressively larger at 
longer distances or lower energies, ultimately being strong enough to lead 
(presumably) to the confinement of quarks and gluons; this is sometimes re- 
ferred to as ‘infrared slavery’. In this regime perturbation theory clearly fails. 
An alternative, purely numerical, approach is available however, namely the 
method of ‘lattice’ QCD, which involves replacing the space-time continuum 
by a discrete lattice of points. At first sight, this may seem a topic rather 
disconnected from everything that has preceded it. But we shall see that in 
fact it provides some powerful new insights into several aspects of quantum 
field theory in general, and in particular of renormalization, by revisiting it in 
coordinate (rather than momentum) space. Quite apart from this, however, 
results from lattice QCD now provide independent confirmation of the theory, 
in the non-perturbative regime. 




14.1 The colour degree of freedom 


The first intimation of a new, unrevealed degree of freedom of matter came 
from baryon spectroscopy (Greenberg 1964; see also Han and Nambu 1965, 
and Tavkhelidze 1965). For a baryon made of three spin-$\frac12$ quarks, the original
non-relativistic quark model wave-function took the form


\[ \psi_{3q} = \psi_{3q,\text{space}}\,\psi_{3q,\text{spin}}\,\psi_{3q,\text{flavour}}. \tag{14.1} \]


It was soon realized (e.g. Dalitz 1965) that the product of these space, spin 
and flavour wavefunctions for the ground state baryons was symmetric under 
interchange of any two quarks. For example, the $\Delta^{++}$ state mentioned in
section 12.2.3 is made of three u quarks (flavour symmetric) in the $J^P = \frac32^+$
state, which has zero orbital angular momentum and is hence spatially
symmetric, and a symmetric $S = \frac32$ spin wavefunction. But we saw in section
7.2 that quantum field theory requires fermions to obey the exclusion principle,
i.e. the wavefunction $\psi_{3q}$ should be antisymmetric with respect to quark
interchange. A simple way of implementing this requirement is to suppose 
that the quarks carry a further degree of freedom, called colour, with respect 
to which the 3q wavefunction can be antisymmetrized, as follows (Fritzsch 
and Gell-Mann 1972, Bardeen, Fritzsch and Gell-Mann 1973). We introduce 


a colour wavefunction with colour index $\alpha$:

\[ \psi_\alpha \qquad (\alpha = 1, 2, 3). \]

We are here writing the three labels as ‘1, 2, 3’, but they are often referred to
by colour names such as ‘red, blue, green’; it should be understood that this
is merely a picturesque way of referring to the three basic states of this degree
of freedom, and has nothing to do with real colour! With the addition of this
degree of freedom we can certainly form a three-quark wavefunction which is
antisymmetric in colour by using the antisymmetric symbol $\epsilon_{\alpha\beta\gamma}$, namely$^1$


\[ \psi_{3q,\text{colour}} = \epsilon_{\alpha\beta\gamma}\,\psi_\alpha\psi_\beta\psi_\gamma \tag{14.2} \]


and this must then be multiplied into (14.1) to give the full 3q wavefunction. 
To date, all known baryon states can be described this way, i.e. the
‘traditional’ space-spin-flavour wavefunction (14.1) is symmetric overall,
while the required antisymmetry is restored by the additional factor (14.2). As
far as meson ($q\bar q$) states are concerned, what was previously a $\pi^+$ wavefunction
$d^*u$ is now
\[ \frac{1}{\sqrt3}\,(d_1^*u_1 + d_2^*u_2 + d_3^*u_3) \tag{14.3} \]
which we write in general as $(1/\sqrt3)\,d^\dagger_\alpha u_\alpha$. We shall shortly see the group
theoretical significance of this ‘neutral superposition’, and of (14.2). Meanwhile,
we note that (14.2) is actually the only way of making an antisymmetric
combination of the three $\psi$'s; it is therefore called a (colour) singlet. It is
reassuring that there is only one way of doing this: otherwise, we would have
obtained more baryon states than are physically observed. As we shall see in
section 14.2.1, (14.3) is also a colour singlet combination.
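The antisymmetry of (14.2) is easy to confirm numerically. The sketch below (Python with NumPy, an illustration that is not part of the original text) assigns a separate, randomly chosen colour vector to each of the three quarks, checks that interchanging any two flips the sign of the colour wavefunction, and, anticipating the group-theoretical discussion, checks that the combination is unchanged when all three quarks are transformed by the same unitary matrix of unit determinant. The helper names are hypothetical.

```python
import numpy as np

def levi_civita():
    """Antisymmetric symbol eps_{abc} with eps_{123} = +1 (indices 0..2 here)."""
    eps = np.zeros((3, 3, 3))
    for a, b, c in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
        eps[a, b, c] = 1.0    # even permutations
        eps[a, c, b] = -1.0   # odd permutations
    return eps

EPS = levi_civita()

def colour_wavefunction(psi1, psi2, psi3):
    """eps_{abc} psi1_a psi2_b psi3_c, one colour vector per quark, as in (14.2)."""
    return np.einsum('abc,a,b,c->', EPS, psi1, psi2, psi3)

def random_unit_det_unitary(rng):
    """Random 3x3 unitary matrix rescaled to unit determinant (illustrative helper)."""
    z = rng.normal(size=(3, 3)) + 1j * rng.normal(size=(3, 3))
    q, _ = np.linalg.qr(z)                       # q is unitary
    return q / np.linalg.det(q) ** (1.0 / 3.0)   # remove the overall phase

rng = np.random.default_rng(0)
q1, q2, q3 = (rng.normal(size=3) + 1j * rng.normal(size=3) for _ in range(3))
w = colour_wavefunction(q1, q2, q3)
V = random_unit_det_unitary(rng)

print(np.isclose(colour_wavefunction(q2, q1, q3), -w))   # swap quarks 1 and 2
print(np.isclose(colour_wavefunction(q1, q3, q2), -w))   # swap quarks 2 and 3
print(np.isclose(colour_wavefunction(V @ q1, V @ q2, V @ q3), w))  # singlet
```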

The above would seem a somewhat artificial device unless there were some 
physical consequences of this increase in the number of quark types — and there 
are. In any process which we can describe in terms of creation or annihilation 
of quarks, the multiplicity of quark types will enter into the relevant observable 
cross section or decay rate. For example, at high energies the ratio 


\[ R = \frac{\sigma(e^+e^-\to\text{hadrons})}{\sigma(e^+e^-\to\mu^+\mu^-)} \tag{14.4} \]


will, in the quark parton model (see section 9.5), reflect the magnitudes of the
individual quark couplings to the photon:


\[ R = \sum_a e_a^2 \tag{14.5} \]


where $a$ runs over all quark types. For five quarks u, d, s, c, b with respective
charges $\frac23, -\frac13, -\frac13, \frac23, -\frac13$, this yields


\[ R_{\text{no colour}} = \frac{11}{9} \tag{14.6} \]

and
\[ R_{\text{colour}} = \frac{11}{3} \tag{14.7} \]


for the two cases, as we saw in section 9.5. (The values $R = 2$ below the charm
threshold, and $R = 10/3$ below the b threshold, were predicted by Bardeen et
al. 1973). The data (figure 14.1) rule out (14.6), and are in good agreement
with (14.7) at energies well above the b threshold, and well below the $Z^0$
resonance peak. There is an indication that the data tend to lie above the
parton model prediction; this is actually predicted by QCD via higher-order
corrections, as will be discussed in section 15.1.

$^1$In (14.2) each $\psi$ refers to a different quark, but we have not indicated the quark labels
explicitly.

FIGURE 14.1
The ratio R (see (14.4)). Figure reprinted with permission from L. Montanet
et al. Physical Review D 50 1173 (1994). Copyright 1994 by the American
Physical Society.
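The parton-model values of $R$ quoted in this paragraph follow from summing squared quark charges, with an extra factor of 3 when colour is counted. A minimal sketch in Python, using exact fractions (the function and dictionary names are our own, for illustration only):

```python
from fractions import Fraction

# Quark charges in units of e
CHARGES = {"u": Fraction(2, 3), "d": Fraction(-1, 3), "s": Fraction(-1, 3),
           "c": Fraction(2, 3), "b": Fraction(-1, 3)}

def R(flavours, n_colours=1):
    """Parton-model ratio (14.5): sum of squared charges over the active quark
    flavours, multiplied by the number of colours."""
    return n_colours * sum(CHARGES[f] ** 2 for f in flavours)

print(R("udscb"))       # 11/9, the no-colour value (14.6)
print(R("udscb", 3))    # 11/3, the value with three colours (14.7)
print(R("uds", 3))      # 2, below the charm threshold
print(R("udsc", 3))     # 10/3, below the b threshold
```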

A number of branching fractions also provide simple ways of measuring
the number of colours $N_c$. For example, consider the branching fraction for
$\tau^-\to e^-\bar\nu_e\nu_\tau$ (i.e. the ratio of the rate for $\tau^-\to e^-\bar\nu_e\nu_\tau$ to that for all
decays). $\tau^-$ decays proceed via the weak process shown in figure 14.2, where
the final fermions can be $e^-\bar\nu_e$, $\mu^-\bar\nu_\mu$, or $\bar u d$, the last with multiplicity $N_c$.
Thus


\[ B(\tau^-\to e^-\bar\nu_e\nu_\tau) \approx \frac{1}{2+N_c}. \tag{14.8} \]
Experiments give $B \approx 18\%$ and hence $N_c \approx 3$.
Similarly, the branching fraction $B(W^-\to e^-\bar\nu_e)$ is $\sim \frac{1}{3+2N_c}$ (from $f = e, \mu, \tau$, u and c).
Experiment gives a value of 10.7%, so again $N_c \approx 3$.
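To see how sharply these simple counting estimates depend on the number of colours, one can tabulate them for a few values of $N_c$. The following is only a sketch of that counting (Python; the function names are ours), not a calculation of the actual decay rates.

```python
def tau_to_e_fraction(n_c):
    """Counting estimate (14.8): one channel out of 2 leptonic + n_c hadronic."""
    return 1.0 / (2 + n_c)

def w_to_e_fraction(n_c):
    """Counting estimate for W -> e nu: one channel out of 3 leptonic + 2 n_c hadronic."""
    return 1.0 / (3 + 2 * n_c)

for n_c in (1, 2, 3, 4):
    print(n_c,
          f"{100 * tau_to_e_fraction(n_c):.1f}%",
          f"{100 * w_to_e_fraction(n_c):.1f}%")
```

For $N_c = 3$ the estimates are 20% and about 11.1%, to be compared with the measured values of about 18% and 10.7% quoted in the text.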

In chapter 9 we also discussed the Drell-Yan process in the quark parton
model; it involves the subprocess $q\bar q\to l\bar l$ which is the inverse of the one in
(14.4). We mentioned that a factor of $\frac13$ appears in this case: it arises because
we must average over the nine possible initial $q\bar q$ combinations (factor $\frac19$)
and then sum over the number of such states that lead to the colour neutral
photon, which is 3 ($\bar q_1q_1$, $\bar q_2q_2$ and $\bar q_3q_3$). With this factor, and using quark
distribution functions consistent with deep inelastic scattering, the parton
model gives a good first approximation to the data.

FIGURE 14.2
$\tau$ decay.

FIGURE 14.3
Triangle graph for $\pi^0$ decay.

Finally, we mention the rate for $\pi^0\to\gamma\gamma$. As will be discussed in section
18.4, this process is entirely calculable from the graph shown in figure 14.3
(and the one with the $\gamma$'s ‘crossed’), where ‘q’ is u or d. The amplitude is
proportional to the square of the quark charges, but because the $\pi^0$ is an
isovector, the contributions from the $u\bar u$ and $d\bar d$ states have opposite signs
(see section 12.1.3). Thus the rate contains a factor


\[ ((2/3)^2 - (1/3)^2)^2 = \frac19. \tag{14.9} \]


However, the original calculation of this rate by Steinberger (1949) used a 
model in which the proton and neutron replaced the u and d in the loop, in 
which case the factor corresponding to (14.9) is just 1 (since the n has zero 
charge). Experimentally the rate agrees well with Steinberger’s calculation, 
indicating that (14.9) needs to be multiplied by 9, which corresponds to $N_c = 3$
identical amplitudes of the form shown in figure 14.3, as was noted by Bardeen, 
Fritzsch and Gell-Mann (1973). 
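The arithmetic behind this argument can be spelled out in a few lines. In this sketch (Python, exact fractions; the variable names are ours) the amplitude factor for a single colour is squared to give (14.9), and $N_c$ identical amplitudes multiply the rate by $N_c^2$.

```python
from fractions import Fraction

e_u, e_d = Fraction(2, 3), Fraction(-1, 3)

# Rate factor for one colour, as in (14.9): the u and d loops enter with opposite signs.
single_colour_factor = (e_u**2 - e_d**2) ** 2

# N_c identical amplitudes add coherently, so the rate scales as N_c^2.
n_c = 3
print(single_colour_factor)            # 1/9
print(n_c**2 * single_colour_factor)   # 1
```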
14.2 The dynamics of colour 
14.2.1 Colour as an SU(3) group 


We now want to consider the possible dynamical role of colour — in other 
words, the way in which the forces between quarks depend on their colours. 
We have seen that we seem to need three different quark types for each given 
flavour. They must all have the same mass, or else we would observe some 
‘fine structure’ in the hadronic levels. Furthermore, and for the same reason, 
‘colour’ must be an exact symmetry of the Hamiltonian governing the quark 
dynamics. What symmetry group is involved? We shall consider how some 
empirical facts suggest that the answer is SU(3)$_c$.

To begin with, it is certainly clear that the interquark force must depend 
on colour, since we do not observe ‘colour multiplicity’ of hadronic states: for 
example we do not see eight other coloured $\pi^+$’s ($d_1^*u_2$, $d_3^*u_1$, ...) degenerate
with the one ‘colourless’ physical $\pi^+$ whose wavefunction was given previously.
The observed hadronic states are all colour singlets, and the force must
somehow be responsible for this. More particularly, the force has to produce 
only those very restricted types of quark configuration which are observed in 
the hadron spectrum. Consider again the isospin multiplets in nuclear physics 
discussed in section 12.1.2. There is one very striking difference in the particle
physics case: for mesons only $T = 0, \frac12$ and 1 occur, and for baryons
only $T = 0, \frac12, 1$ and $\frac32$, while in nuclei there is nothing in principle to stop
us finding $T = \frac52, 3, \ldots$ states. (In fact such nuclear states are hard to identify
experimentally, because they occur at high excitation energy for some of
the isobars, cf figure 1.8(c), where the levels are very dense.) The same
restriction holds for SU(3)$_f$ also: only 1’s and 8’s occur for mesons; and only
1’s, 8’s and 10’s for baryons. In quark terms, this of course is what is translated
into the recipe: ‘mesons are $q\bar q$, baryons are $qqq$’. It is as if we said,
in nuclear physics, that only $A = 2$ and $A = 3$ nuclei exist! Thus the quark
forces must have a dramatic saturation property: apparently no $\bar qqq$, no $qqqq$,
$qqqqq$, ... states exist. Furthermore, no $qq$ or $\bar q\bar q$ states exist either; nor, for
that matter, do single $q$’s or $\bar q$’s. All this can be summarized by saying that
the quark colour degree of freedom must be confined, a property we shall now 
assume and return to in chapter 16. 

If we assume that only colour singlet states exist (Fritzsch and Gell-Mann
1972, Bardeen, Fritzsch and Gell-Mann 1973), and that the strong interquark
force depends only on colour, the fact that $\bar{q}q$ states are seen but $qq$ and $\bar{q}\bar{q}$ are
not gives us an important clue as to what group to associate with colour. One
simple possibility might be that the three colours correspond to the components
of an SU(2)$_{\rm c}$ triplet ‘$\psi$’. The antisymmetric, colour singlet, three-quark
baryon wavefunction of (14.2) is then just the triple scalar product $\boldsymbol{\psi}_1 \cdot \boldsymbol{\psi}_2 \times \boldsymbol{\psi}_3$,
which seems satisfactory. But what about the meson wavefunction? Mesons
are formed of quarks and antiquarks, and we recall from sections 12.1.3 and
12.2 that antiquarks belong to the complex conjugate of the representation (or
multiplet) to which quarks belong. Thus if a quark colour triplet wavefunction
$\psi_\alpha$ transforms under a colour transformation as


$$\psi_\alpha \to \psi'_\alpha = V^{(1)}_{\alpha\beta}\,\psi_\beta \tag{14.10}$$


where $V^{(1)}$ is a 3 × 3 unitary matrix appropriate to the $T = 1$ representation
of SU(2) (cf (12.48) and (12.49)), then the wavefunction for the ‘anti’-triplet
is $\psi^*_\alpha$, which transforms as


$$\psi^*_\alpha \to \psi^{*\prime}_\alpha = V^{(1)*}_{\alpha\beta}\,\psi^*_\beta. \tag{14.11}$$


Given this information, we can now construct colour singlet wavefunctions for
mesons, built from $\bar{q}q$. Consider the quantity (cf (14.3)) $\sum_\alpha \psi^*_\alpha \psi_\alpha$, where $\psi^*$
represents the antiquark and $\psi$ the quark. This may be written in matrix
notation as $\psi^\dagger\psi$, where $\psi^\dagger$ as usual denotes the transpose of the complex
conjugate of the column vector $\psi$. Then, taking the transpose of (14.11), we
find that $\psi^\dagger$ transforms by

$$\psi^\dagger \to \psi^{\dagger\prime} = \psi^\dagger V^{(1)\dagger} \tag{14.12}$$

so that the combination $\psi^\dagger\psi$ transforms as

$$\psi^\dagger\psi \to \psi^{\dagger\prime}\psi' = \psi^\dagger V^{(1)\dagger} V^{(1)} \psi = \psi^\dagger\psi \tag{14.13}$$

where the last step follows since $V^{(1)}$ is unitary (compare (12.58)). Thus the
product is invariant under (14.10) and (14.11); that is, it is a colour singlet,
as required. This is the meaning of the superposition (14.3).

All this may seem fine, but there is a problem. The three-dimensional
representation of SU(2)$_{\rm c}$ which we are using here has a very special nature:
the matrix $V^{(1)}$ can be chosen to be real. This can be understood ‘physically’
if we make use of the great similarity between SU(2) and the group of rotations
in three dimensions (which is the reason for the geometrical language of
isospin ‘rotations’, and so on). We know very well how real three-dimensional
vectors transform, namely by an orthogonal 3 × 3 matrix. It is the same in
SU(2). It is always possible to choose the wavefunctions $\psi$ to be real, and the
transformation matrix $V^{(1)}$ to be real also. Since $V^{(1)}$ is, in general, unitary,
this means that it must be orthogonal. But now the basic difficulty appears:
there is no distinction between $\psi$ and $\psi^*$! They both transform by the real
matrix $V^{(1)}$. This means that we can make SU(2) invariant (colour singlet)
combinations for $qq$ states, and for $\bar{q}\bar{q}$ states, just as well as for $\bar{q}q$ states;
indeed they are formally identical. But such ‘diquark’ (or ‘antidiquark’) states
are not found, and hence, by assumption, should not be colour singlets.
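
The difficulty with a real transformation matrix can be checked numerically. The following is a minimal sketch, not part of the original argument: it assumes a real orthogonal matrix built from three elementary rotations as a stand-in for a real $V^{(1)}$, and two arbitrary real triplet wavefunctions (the names `rotation`, `diquark`, `quark_1` and `quark_2` are illustrative choices). It shows that the unconjugated two-quark contraction is invariant, so a ‘diquark’ singlet would exist.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def apply(m, v):
    """Matrix times column vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def rotation(a, b, c):
    """Real orthogonal matrix Rz(c) Ry(b) Rx(a), standing in for a real V^(1)."""
    rx = [[1, 0, 0], [0, math.cos(a), -math.sin(a)], [0, math.sin(a), math.cos(a)]]
    ry = [[math.cos(b), 0, math.sin(b)], [0, 1, 0], [-math.sin(b), 0, math.cos(b)]]
    rz = [[math.cos(c), -math.sin(c), 0], [math.sin(c), math.cos(c), 0], [0, 0, 1]]
    return matmul(rz, matmul(ry, rx))

def diquark(q1, q2):
    """Contraction sum_alpha q1_alpha q2_alpha, with no complex conjugation."""
    return sum(x * y for x, y in zip(q1, q2))

# Arbitrary illustrative angles and (real) triplet wavefunctions.
V = rotation(0.3, -1.1, 2.0)
quark_1 = [0.2, -0.7, 1.5]
quark_2 = [1.0, 0.4, -0.3]

before = diquark(quark_1, quark_2)
after = diquark(apply(V, quark_1), apply(V, quark_2))
print(abs(after - before) < 1e-12)  # the qq combination is unchanged by a real V
```

Because the matrix is orthogonal, the same contraction serves for $qq$, $\bar{q}\bar{q}$ and $\bar{q}q$, which is exactly the degeneracy the text objects to.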

The next simplest possibility seems to be that the three colours correspond
to the components of an SU(3)$_{\rm c}$ triplet. In this case the quark colour
wavefunction $\psi_\alpha$ transforms as (cf (12.74))


$$\psi \to \psi' = W\psi \tag{14.14}$$

where $W$ is a special unitary 3 × 3 matrix parametrized as

$$W = \exp({\rm i}\,\boldsymbol{\alpha}\cdot\boldsymbol{\lambda}/2), \tag{14.15}$$

and $\psi^\dagger$ transforms as

$$\psi^\dagger \to \psi^{\dagger\prime} = \psi^\dagger W^\dagger. \tag{14.16}$$


The proof of the invariance of $\psi^\dagger\psi$ goes through as in (14.13), and it can be
shown (problem 14.1(a)) that the antisymmetric 3q combination (14.2) is also
an SU(3)$_{\rm c}$ invariant. Thus both the proposed meson and baryon states are
colour singlets. It is not possible to choose the $\lambda$’s to be pure imaginary in
(14.15), and thus the 3 × 3 $W$ matrices of SU(3)$_{\rm c}$ cannot be real, so that there
is a distinction between $\psi$ and $\psi^*$, as we learned in section 12.2. Indeed, it
can be shown (see Carruthers 1966, chapter 3, Jones 1990, chapter 8, and also
problem 14.1(b)) that, unlike the case of SU(2)$_{\rm c}$ triplets, it is not possible to
form an SU(3)$_{\rm c}$ colour singlet combination out of two colour triplets $qq$ or
anti-triplets $\bar{q}\bar{q}$. Thus SU(3)$_{\rm c}$ seems to be a possible and economical choice
for the colour group.
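
The three statements above (the $\bar{q}q$ and $qqq$ combinations are invariant, the plain $qq$ contraction is not) can be illustrated with one sample SU(3) matrix. This is a sketch under stated assumptions rather than a proof: the matrix is built as unit-determinant diagonal phases times two real rotations, which is one convenient special unitary matrix and not the general parametrization (14.15); all function names and numerical values are illustrative.

```python
import cmath
import math

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def apply(m, v):
    """Matrix times column vector."""
    return [sum(m[i][j] * v[j] for j in range(3)) for i in range(3)]

def sample_su3(phi1, phi2, angle_z, angle_x):
    """A special unitary matrix: diag phases (product = 1) times Rz times Rx."""
    cz, sz = math.cos(angle_z), math.sin(angle_z)
    cx, sx = math.cos(angle_x), math.sin(angle_x)
    rot_z = [[cz, -sz, 0], [sz, cz, 0], [0, 0, 1]]
    rot_x = [[1, 0, 0], [0, cx, -sx], [0, sx, cx]]
    phases = [[cmath.exp(1j * phi1), 0, 0],
              [0, cmath.exp(1j * phi2), 0],
              [0, 0, cmath.exp(-1j * (phi1 + phi2))]]
    return matmul(phases, matmul(rot_z, rot_x))

def meson(psi, phi):
    """psi^dagger phi: the antiquark is represented by the conjugated triplet."""
    return sum(complex(a).conjugate() * q for a, q in zip(psi, phi))

def diquark(q1, q2):
    """Unconjugated contraction sum_alpha q1_alpha q2_alpha."""
    return sum(x * y for x, y in zip(q1, q2))

def baryon(q1, q2, q3):
    """Antisymmetric combination epsilon_{abc} q1_a q2_b q3_c."""
    return (q1[0] * (q2[1] * q3[2] - q2[2] * q3[1])
            - q1[1] * (q2[0] * q3[2] - q2[2] * q3[0])
            + q1[2] * (q2[0] * q3[1] - q2[1] * q3[0]))

W = sample_su3(0.4, -0.9, 0.7, 1.3)
q1, q2, q3 = [1, 0, 0], [0.3 + 0.2j, 1, 0], [0, -0.5j, 1]
w1, w2, w3 = apply(W, q1), apply(W, q2), apply(W, q3)

meson_shift = abs(meson(w1, w2) - meson(q1, q2))          # ~0: invariant
baryon_shift = abs(baryon(w1, w2, w3) - baryon(q1, q2, q3))  # ~0: invariant (det W = 1)
diquark_shift = abs(diquark(w1, w1) - diquark(q1, q1))     # not small: qq is not a singlet
print(meson_shift < 1e-12, baryon_shift < 1e-12, diquark_shift > 0.1)
```

The baryon combination is the determinant of the three triplets, so it picks up a factor $\det W = 1$; the diquark contraction changes because $W^{\rm T}W \neq 1$ once $W$ is genuinely complex.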


14.2.2 Global SU(3)$_{\rm c}$ invariance, and ‘scalar gluons’


As stated above, we are assuming, on empirical grounds, that the only physically
observed hadronic states are colour singlets, and this now means singlets
under SU(3)$_{\rm c}$. What sort of interquark force could produce this dramatic
result? Consider an SU(2) analogy again, the interaction of two nucleons belonging
to the lowest (doublet) representation of SU(2). Labelling the states
by an isospin $T$, the possible $T$ values for two nucleons are $T = 1$ (triplet) and
$T = 0$ (singlet). We know of an isospin-dependent force which can produce a
splitting between these states, namely $V \boldsymbol{\tau}_1\cdot\boldsymbol{\tau}_2$, where the ‘1’ and ‘2’ refer to
the two nucleons. The total isospin is $\boldsymbol{T} = \frac{1}{2}(\boldsymbol{\tau}_1 + \boldsymbol{\tau}_2)$, and we have

$$\boldsymbol{T}^2 = \frac{1}{4}(\boldsymbol{\tau}_1^2 + 2\boldsymbol{\tau}_1\cdot\boldsymbol{\tau}_2 + \boldsymbol{\tau}_2^2) = \frac{1}{2}(\boldsymbol{\tau}_1\cdot\boldsymbol{\tau}_2 + 3) \tag{14.17}$$

whence

$$\boldsymbol{\tau}_1\cdot\boldsymbol{\tau}_2 = 2\boldsymbol{T}^2 - 3. \tag{14.18}$$

In the triplet state $\boldsymbol{T}^2 = 2$, and in the singlet state $\boldsymbol{T}^2 = 0$. Thus

$$(\boldsymbol{\tau}_1\cdot\boldsymbol{\tau}_2)_{T=1} = 1 \tag{14.19}$$
$$(\boldsymbol{\tau}_1\cdot\boldsymbol{\tau}_2)_{T=0} = -3 \tag{14.20}$$

and if $V$ is positive the $T = 0$ state is pulled down. A similar thing happens
in SU(3)$_{\rm c}$. Suppose this interquark force depended on the quark colours via
a term proportional to

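$$\boldsymbol{\lambda}_1\cdot\boldsymbol{\lambda}_2. \tag{14.21}$$

Then, in just the same way, we can introduce the total colour operator

$$\boldsymbol{F} = \frac{1}{2}(\boldsymbol{\lambda}_1 + \boldsymbol{\lambda}_2), \tag{14.22}$$

so that

$$\boldsymbol{F}^2 = \frac{1}{4}(\boldsymbol{\lambda}_1^2 + 2\boldsymbol{\lambda}_1\cdot\boldsymbol{\lambda}_2 + \boldsymbol{\lambda}_2^2) \tag{14.23}$$

and

$$\boldsymbol{\lambda}_1\cdot\boldsymbol{\lambda}_2 = 2\boldsymbol{F}^2 - \boldsymbol{\lambda}^2, \tag{14.24}$$

where $\boldsymbol{\lambda}_1^2 = \boldsymbol{\lambda}_2^2 = \boldsymbol{\lambda}^2$, say. Here $\boldsymbol{\lambda}^2 = \sum_{a=1}^{8}(\lambda_a)^2$ is found (see (12.75)) to
have the value 16/3 (the unit matrix being understood). The operator $\boldsymbol{F}^2$
commutes with all components of $\boldsymbol{\lambda}_1$ and $\boldsymbol{\lambda}_2$ (as $\boldsymbol{T}^2$ does with $\boldsymbol{\tau}_1$ and $\boldsymbol{\tau}_2$)
and represents the quadratic Casimir operator $\hat{C}_2$ of SU(3)$_{\rm c}$ (see section M.5
of appendix M), in the colour space of the two quarks considered here. The
eigenvalues of $\hat{C}_2$ play a very important role in SU(3)$_{\rm c}$, analogous to that of the
total spin/angular momentum in SU(2). They depend on the SU(3)$_{\rm c}$ representation:
indeed, they are one of the defining labels of SU(3) representations
in general (see section M.5). Two quarks, each in the representation $3_{\rm c}$, combine
to give a $6_{\rm c}$-dimensional representation and a $3^*_{\rm c}$ (see problem 14.1(b),
and Jones (1990) chapter 8). The value of $\hat{C}_2$ for the sextet $6_{\rm c}$ representation
is 10/3, and for the $3^*_{\rm c}$ representation is 4/3. Thus the ‘$\boldsymbol{\lambda}_1\cdot\boldsymbol{\lambda}_2$’ interaction
will produce a negative (attractive) eigenvalue −8/3 in the $3^*_{\rm c}$ states, but a
repulsive eigenvalue +4/3 in the $6_{\rm c}$ states, for two quarks.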
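
Both the isospin values (14.19) and (14.20) and the two-quark colour eigenvalues just quoted can be checked by brute force. The sketch below is illustrative only: it assumes the standard Pauli and Gell-Mann matrices, represents a two-particle state by its amplitudes `state[j][l]`, and uses one symmetric state (belonging to the triplet or sextet) and one antisymmetric state (singlet or anti-triplet) as examples; the helper names are hypothetical.

```python
import math

PAULI = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

_S = 1 / math.sqrt(3)
GELL_MANN = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
    [[_S, 0, 0], [0, _S, 0], [0, 0, -2 * _S]],
]

def pair_operator(generators, state):
    """Apply sum_a G_a (x) G_a to a two-particle state with amplitudes state[j][l]."""
    n = len(state)
    out = [[0j] * n for _ in range(n)]
    for g in generators:
        for i in range(n):
            for k in range(n):
                out[i][k] += sum(g[i][j] * g[k][l] * state[j][l]
                                 for j in range(n) for l in range(n))
    return out

def eigenvalue(generators, state, tol=1e-12):
    """Return c if (sum_a G_a (x) G_a) state = c * state; raise otherwise."""
    n = len(state)
    out = pair_operator(generators, state)
    ratios = [out[i][k] / state[i][k]
              for i in range(n) for k in range(n) if state[i][k] != 0]
    leaks = [abs(out[i][k]) for i in range(n) for k in range(n) if state[i][k] == 0]
    if any(x > tol for x in leaks) or any(abs(r - ratios[0]) > tol for r in ratios):
        raise ValueError("state is not an eigenstate")
    return ratios[0].real

isospin_triplet = [[0, 1], [1, 0]]    # symmetric: T = 1
isospin_singlet = [[0, 1], [-1, 0]]   # antisymmetric: T = 0
colour_sextet = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]        # symmetric in colour
colour_antitriplet = [[0, 1, 0], [-1, 0, 0], [0, 0, 0]]  # antisymmetric in colour

tau_triplet = eigenvalue(PAULI, isospin_triplet)          # expect +1
tau_singlet = eigenvalue(PAULI, isospin_singlet)          # expect -3
lam_sextet = eigenvalue(GELL_MANN, colour_sextet)         # expect +4/3
lam_antitriplet = eigenvalue(GELL_MANN, colour_antitriplet)  # expect -8/3
print(tau_triplet, tau_singlet, round(lam_sextet, 6), round(lam_antitriplet, 6))
```

The states are left unnormalized on purpose: only proportionality matters for reading off an eigenvalue.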

The maximum attraction will clearly be for states in which $\boldsymbol{F}^2$ is zero.
This is the singlet representation $1_{\rm c}$. Two quarks cannot combine to give
a colour singlet state, but we have seen in section 12.2 that a quark and an
antiquark can: they combine to give $1_{\rm c}$ and $8_{\rm c}$. In this case (14.24) is replaced
by

$$\boldsymbol{\lambda}_1\cdot\boldsymbol{\lambda}_2 = 2\boldsymbol{F}^2 - \frac{1}{2}(\boldsymbol{\lambda}_1^2 + \boldsymbol{\lambda}_2^2), \tag{14.25}$$

where ‘1’ refers to the quark and ‘2’ to the antiquark. Thus the ‘$\boldsymbol{\lambda}_1\cdot\boldsymbol{\lambda}_2$’
interaction will give a repulsive eigenvalue +2/3 in the $8_{\rm c}$ channel, for which
$\hat{C}_2 = 3$, and a ‘maximally attractive’ eigenvalue −16/3 in the $1_{\rm c}$ channel, for
a quark and an antiquark.
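
The four eigenvalues quoted so far follow from (14.24) and (14.25) by simple arithmetic, which the short sketch below carries out with exact fractions. It takes the Casimir values 10/3, 4/3, 3 and 0 and the value 16/3 for $\boldsymbol{\lambda}^2$ directly from the text; the dictionary keys and function name are illustrative labels only.

```python
from fractions import Fraction

LAMBDA_SQUARED = Fraction(16, 3)  # lambda^2 for a colour triplet or anti-triplet

def lambda1_dot_lambda2(casimir):
    """Equations (14.24)/(14.25) with lambda_1^2 = lambda_2^2 = 16/3."""
    return 2 * casimir - LAMBDA_SQUARED

CASIMIR = {
    "qq, 6_c": Fraction(10, 3),
    "qq, 3*_c": Fraction(4, 3),
    "q qbar, 8_c": Fraction(3),
    "q qbar, 1_c": Fraction(0),
}

eigenvalues = {channel: lambda1_dot_lambda2(c2) for channel, c2 in CASIMIR.items()}
for channel, value in eigenvalues.items():
    print(channel, value)
# qq, 6_c 4/3
# qq, 3*_c -8/3
# q qbar, 8_c 2/3
# q qbar, 1_c -16/3
```

Only the two channels with the smallest Casimir values come out attractive, and the singlet is the most attractive of all.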

In the case of baryons, built from three quarks, we have seen that when
two of them are coupled to the $3^*_{\rm c}$ state, the eigenvalue of $\boldsymbol{\lambda}_1\cdot\boldsymbol{\lambda}_2$ is −8/3, one
half of the attraction in the $\bar{q}q$ colour singlet state, but still strongly attractive.
The ($qq$) pair in the $3^*_{\rm c}$ state can then couple to the remaining third quark to
make the overall colour singlet state (14.2), with maximum binding.

FIGURE 14.4
Scalar gluon exchange between two quarks.

Of course, such a simple potential model does not imply that the energy
difference between the $1_{\rm c}$ states and all coloured states is infinite, as our
strict ‘colour singlets only’ hypothesis would demand, and which would be
one (rather crude) way of interpreting confinement. Nevertheless, we can ask:
what single particle exchange process between quark (or antiquark) colour
triplets produces a $\boldsymbol{\lambda}_1\cdot\boldsymbol{\lambda}_2$ type of term? The answer is the exchange of
an SU(3)$_{\rm c}$ octet ($8_{\rm c}$) of particles, which (anticipating somewhat) we shall call
gluons. Since colour is an exact symmetry, the quark wave equation describing
the colour interactions must be SU(3)$_{\rm c}$ covariant. A simple such equation is


$$({\rm i}\slashed{\partial} - m)\psi = g_{\rm s}\frac{\lambda_a}{2}A_a\psi \tag{14.26}$$


where $g_{\rm s}$ is a ‘strong charge’ and $A_a$ ($a = 1, 2, \ldots, 8$) is an octet of scalar
‘gluon potentials’. Equation (14.26) may be compared with (13.58): in the
latter, a slashed vector field appears on the right-hand side, because the gauge field quanta
are vectors rather than scalars. In (14.26), we are dealing at this stage only 
with a global SU(3) symmetry, not a local SU(3) gauge symmetry, and so the 
potentials may be taken to be scalars, for simplicity. As in (13.60), the vertex 
corresponding to (14.26) is 

$$-{\rm i}g_{\rm s}\lambda_a/2. \tag{14.27}$$


(14.27) differs from (13.60) simply in the absence of the $\gamma^\mu$ factor, due to
the assumed scalar, rather than vector, nature of the ‘gluon’ here. When we
put two such vertices together and join them with a gluon propagator (figure
14.4), the SU(3)$_{\rm c}$ structure of the amplitude will be

$$\frac{\lambda_{1a}}{2}\,\delta_{ab}\,\frac{\lambda_{2b}}{2} = \frac{\boldsymbol{\lambda}_1\cdot\boldsymbol{\lambda}_2}{4}, \tag{14.28}$$

the $\delta_{ab}$ arising from the fact that the freely propagating gluon does not change
its colour. This interaction has exactly the required ‘$\boldsymbol{\lambda}_1\cdot\boldsymbol{\lambda}_2$’ character in the
colour space.


14.2.3 Local SU(3)$_{\rm c}$ invariance: the QCD Lagrangian


It is tempting to suppose (Fritzsch and Gell-Mann 1972, Fritzsch, Gell-Mann 
and Leutwyler 1973) that the ‘scalar gluons’ introduced in (14.26) are, in fact, 
vector particles, like the photons of QED. Equation (14.26) then becomes 


$$({\rm i}\slashed{\partial} - m)\psi = g_{\rm s}\frac{\lambda_a}{2}\slashed{A}_a\psi \tag{14.29}$$

as in (13.58), and the vertex (14.27) becomes

$$-{\rm i}g_{\rm s}\frac{\lambda_a}{2}\gamma^\mu \tag{14.30}$$

as in (13.60). One motivation for this is the desire to make the colour dynamics
as much as possible like the highly successful theory of QED, and to derive
the dynamics from a gauge principle. As we have seen in the last chapter, this
involves the simple but deep step of supposing that the quark wave equation
is covariant under local SU(3)$_{\rm c}$ transformations of the form

$$\psi \to \psi' = \exp({\rm i}g_{\rm s}\boldsymbol{\alpha}(x)\cdot\boldsymbol{\lambda}/2)\,\psi. \tag{14.31}$$

This is implemented by the replacement

$$\partial_\mu \to \partial_\mu + {\rm i}g_{\rm s}\frac{\lambda_a}{2}A_{a\mu}(x) \tag{14.32}$$
in the Dirac equation for the quarks, which leads immediately to (14.29) and 
the vertex (14.30). 

Of course, the assumption of local SU(3)$_{\rm c}$ covariance leads to a great deal
more: for example, it implies that the gluons are massless vector (spin 1)
particles, and that they interact with themselves via three-gluon and four-gluon
vertices, which are the SU(3)$_{\rm c}$ analogues of the SU(2) vertices discussed
in section 13.3.2. The most compact way of summarizing all this structure is
via the Lagrangian, most of which we have already introduced in chapter 13.
Gathering together (13.71) and (13.140) (adapted to SU(3)$_{\rm c}$), we write it out
here for convenience:


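$$\hat{\mathcal{L}}_{\rm QCD} = \sum_{{\rm flavours}\ f} \bar{\hat{q}}_{f,\alpha}({\rm i}\hat{\slashed{D}} - m_f)_{\alpha\beta}\,\hat{q}_{f,\beta} - \frac{1}{4}\hat{F}_{a\mu\nu}\hat{F}_a^{\mu\nu} - \frac{1}{2\xi}(\partial_\mu\hat{A}_a^\mu)(\partial_\nu\hat{A}_a^\nu) + \partial_\mu\hat{\eta}_a^\dagger\,\hat{D}^\mu_{ab}\,\hat{\eta}_b. \tag{14.33}$$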


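In (14.33), repeated indices are as usual summed over: $\alpha$ and $\beta$ are SU(3)$_{\rm c}$-triplet
indices running from 1 to 3, and $a$, $b$ are SU(3)$_{\rm c}$-octet indices running
from 1 to 8. The covariant derivatives are defined by

$$(\hat{D}_\mu)_{\alpha\beta} = \partial_\mu\delta_{\alpha\beta} + {\rm i}g_{\rm s}\frac{1}{2}(\lambda_a)_{\alpha\beta}\hat{A}_{a\mu} \tag{14.34}$$

when acting on the quark SU(3)$_{\rm c}$ triplet, as in (13.53), and by

$$(\hat{D}_\mu)_{ab} = \partial_\mu\delta_{ab} + g_{\rm s}f_{cab}\hat{A}_{c\mu} \tag{14.35}$$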


when acting on the octet of ghost fields. For the second of these, note that
the matrices representing the SU(3) generators in the octet representation are
as given in (12.84), and these take the place of the ‘$\lambda/2$’ in (14.34) (compare
(13.141) in the SU(2) case). We remind the reader that the last two terms
in (14.33) are the gauge-fixing and ghost terms, respectively, appropriate to
a gauge field propagator of the form (13.69) (with $\delta_{ij}$ replaced by $\delta_{ab}$ here).
The Feynman rules following from (14.33) are given in appendix Q.

As remarked in section 12.3.2, the fact that the QCD interactions (14.33)
are ‘flavour-blind’ implies that the global flavour symmetries discussed in
chapter 12 are all preserved by QCD. These include the conservation of each
quark flavour (for example, the number of strange quarks minus the number
of strange antiquarks is conserved); and the symmetries SU(2)$_{\rm f}$ and SU(3)$_{\rm f}$,
and the chiral symmetries SU(2)$_{{\rm f}5}$ and SU(3)$_{{\rm f}5}$, to the extent that these latter
are good symmetries. Further, (14.33) conserves the discrete symmetries P,
C and T, in a manner quite analogous to QED, already covered in section 7.5.
In the case of P and T, the gluon fields $\hat{A}_a^\mu$ have the same transformation
properties as the photon field $\hat{A}^\mu$, and the (normally ordered) SU(3)$_{\rm c}$ currents
$\hat{j}_a^\mu = \bar{\hat{q}}_f\gamma^\mu\frac{1}{2}\lambda_a\hat{q}_f$ transform in the same way as the electromagnetic current
$\bar{\hat{q}}\gamma^\mu\hat{q}$, ensuring P and T invariance. Under C, the quark fields transform as
usual according to (7.151). Charge conjugation for the gluon field needs a
little more care. The required rule is

$$\hat{C}\lambda_a\hat{A}_a^\mu\hat{C}^{-1} = -\lambda_a^*\hat{A}_a^\mu. \tag{14.36}$$


The overall minus sign in (14.36) is analogous to that for the photon field
(cf (7.152)). To understand the complex conjugate on the right-hand side of
(14.36), recall from (7.153) that the complex scalar field $\hat{\phi} = \frac{1}{\sqrt{2}}(\hat{\phi}_1 - {\rm i}\hat{\phi}_2)$
transforms according to

$$\hat{C}(\hat{\phi}_1 - {\rm i}\hat{\phi}_2)\hat{C}^{-1} = \hat{\phi}_1 + {\rm i}\hat{\phi}_2. \tag{14.37}$$


Problem 14.2(a) verifies that the (normally ordered) interaction $\hat{j}_a^\mu\hat{A}_{a\mu}$ is then
C-invariant. As regards the term $\hat{F}_{a\mu\nu}\hat{F}_a^{\mu\nu}$, we can write it as

$$\frac{1}{2}{\rm Tr}(\lambda_a\hat{F}_{a\mu\nu}\lambda_b\hat{F}_b^{\mu\nu}) \tag{14.38}$$

using the relation

$${\rm Tr}(\lambda_a\lambda_b) = 2\delta_{ab}. \tag{14.39}$$

A short calculation (problem 14.2(b)) shows that $\lambda_a\hat{F}_a^{\mu\nu}$ transforms under
C the same way as $\lambda_a\hat{A}_a^\mu$ (i.e. according to (14.36)). Using the complex
conjugate of (14.39), it then follows that (14.38) is invariant under C.


14.2.4 The $\theta$-term


In arriving at (14.33) we have relied essentially on the ‘gauge principle’ (invariance
under a local symmetry) and the requirement of renormalizability (to
forbid the presence of terms with mass dimension higher than 4). The renormalizability
of such a theory was proved by 't Hooft (1971a, b). However,
there is in fact one more gauge invariant term of mass dimension 4 which can
be written down, namely

$$\hat{\mathcal{L}}_\theta = \frac{\theta g_{\rm s}^2}{64\pi^2}\,\epsilon_{\mu\nu\rho\sigma}\hat{F}_a^{\mu\nu}\hat{F}_a^{\rho\sigma}; \tag{14.40}$$

this is the ‘$\theta$-term’ of QCD. A full discussion of this term (see for example
Weinberg 1996, section 23.6) is beyond our scope, but we shall give a brief
introduction to the main ideas.

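The reader may wonder, first of all, whether the $\theta$-term should give rise
to a new Feynman rule. The answer to this begins by noting that (14.40) can
actually be written as a total divergence:

$$\epsilon_{\mu\nu\rho\sigma}\hat{F}_a^{\mu\nu}\hat{F}_a^{\rho\sigma} = \partial_\mu\hat{K}^\mu. \tag{14.41}$$

This is more easily seen in the analogous term for QED, namely $\epsilon_{\mu\nu\rho\sigma}\hat{F}^{\mu\nu}\hat{F}^{\rho\sigma}$.
We have

$$\epsilon_{\mu\nu\rho\sigma}\hat{F}^{\mu\nu}\hat{F}^{\rho\sigma} = \epsilon_{\mu\nu\rho\sigma}(\partial^\mu\hat{A}^\nu - \partial^\nu\hat{A}^\mu)(\partial^\rho\hat{A}^\sigma - \partial^\sigma\hat{A}^\rho) \tag{14.42}$$
$$= 4\epsilon_{\mu\nu\rho\sigma}\,\partial^\mu\hat{A}^\nu\,\partial^\rho\hat{A}^\sigma \tag{14.43}$$
$$= \partial^\mu(4\epsilon_{\mu\nu\rho\sigma}\hat{A}^\nu\partial^\rho\hat{A}^\sigma), \tag{14.44}$$

where we have used the antisymmetry of the $\epsilon$ symbol in (14.43), and also in
(14.44) since the contraction of $\epsilon$ with the symmetric tensor $\partial^\mu\partial^\rho$ vanishes.
We shall not need the explicit form of $\hat{K}^\mu$.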

Any total divergence in a Lagrangian can be integrated to give only a 
‘surface’ term in the action, which can usually be discarded, making conven- 
tional assumptions about the vanishing of the fields at spatial infinity. There 
are, however, field configurations (‘instantons’) which do contribute to the 
$\theta$-term. Such configurations are not reachable in perturbation theory, and so
no perturbative Feynman rules are associated with (14.40). They approach 
a pure gauge form at spatial infinity, and are therefore associated with the 
QCD vacuum state; their effect is equivalent to including the term (14.40) in 
the QCD Lagrangian (see for example Rajaraman 1982). 

The term (14.40) has potentially important phenomenological implications,
since it conserves C but violates both P and T (and hence also CP).
Again, this is easy to see in the QED analogue term (14.42), which equals
$8\boldsymbol{E}\cdot\boldsymbol{B}$ (problem 14.3): we recall that under P, $\boldsymbol{E} \to -\boldsymbol{E}$ and $\boldsymbol{B} \to \boldsymbol{B}$, while
under T, $\boldsymbol{E} \to \boldsymbol{E}$ and $\boldsymbol{B} \to -\boldsymbol{B}$. But we know (section 4.2) that strong interactions
conserve both P and T to a high degree of accuracy. In particular,
the neutron electric dipole moment $d_n$, which would violate both P and T, is
extremely small (see (4.133)). A very crude estimate of the size of $d_n$, induced
by the $\theta$-term, is given by dimensional analysis as

$$d_n \sim \theta\,\frac{e}{M_n} \tag{14.45}$$

where $M_n$ is the neutron mass. This would imply $\theta < 10^{-12}$. In fact, this
estimate is too restrictive, since it turns out (Weinberg 1996, section 23.6)
that if any quark has zero mass, $\theta$ can be reduced to zero by a global chiral
U(1) transformation on that quark field. Although neither of the u and d
quark masses is zero, they are small on a hadronic scale, and a suppression
of (14.45) is expected, increasing (that is, weakening) the bound on $\theta$. Estimates suggest
$\theta < 10^{-9}$ to $10^{-10}$.
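
The order of magnitude behind the crude bound can be reproduced in a few lines. This is a sketch under explicit assumptions: the experimental limit on $d_n$ is given in (4.133), which is not reproduced here, so the value `ASSUMED_DN_LIMIT_E_CM` below is an illustrative input of the right general size and not a quotation from the text; the constants $\hbar c$ and $M_n$ are standard rounded values.

```python
HBAR_C_MEV_FM = 197.327      # hbar*c in MeV fm (rounded standard value)
NEUTRON_MASS_MEV = 939.565   # neutron mass in MeV (rounded standard value)
FM_TO_CM = 1.0e-13

def natural_edm_scale_e_cm():
    """e/M_n of (14.45) with theta = 1, converted to units of e cm."""
    return HBAR_C_MEV_FM / NEUTRON_MASS_MEV * FM_TO_CM

def crude_theta_bound(dn_limit_e_cm):
    """Upper bound on theta from d_n ~ theta * e / M_n."""
    return dn_limit_e_cm / natural_edm_scale_e_cm()

# ASSUMPTION: illustrative experimental upper limit on |d_n| in e cm.
ASSUMED_DN_LIMIT_E_CM = 2.9e-26

scale = natural_edm_scale_e_cm()
bound = crude_theta_bound(ASSUMED_DN_LIMIT_E_CM)
print(f"e/M_n ~ {scale:.2e} e cm, crude bound theta < {bound:.1e}")
```

With these inputs $e/M_n$ is about $2 \times 10^{-14}$ e cm and the crude bound is of order $10^{-12}$, as stated above; the suppression by small quark masses is not modelled here.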

This may seem an unsatisfactorily special value to force on a dimensionless
Lagrangian parameter, when there is nothing in the theory, a priori, to prevent
something of order unity. This perceived difficulty is referred to as the ‘strong
CP problem’. A possible solution to the problem, in which a very small value
of $\theta$ could arise naturally, was suggested by Peccei and Quinn (1977a, 1977b).
Their idea goes beyond the Standard Model, and involves the existence of a
new very light pseudoscalar particle, the axion (Wilczek 1978, Weinberg 1978).

We proceed now with the main topic of this chapter, which is the applica- 
tion of perturbative QCD. 




14.3 Hard scattering processes, QCD tree graphs, and jets


14.3.1 Introduction 


The fundamental distinctive feature of non-Abelian gauge theories is that they 
are ‘asymptotically free’, meaning that the effective coupling strength becomes 
progressively smaller at short distances, or high energies (Gross and Wilczek 
1973, Politzer 1973). This property is the most compelling theoretical motiva- 
tion for choosing a non-Abelian gauge theory for the strong interactions, and 
it enables a quantitative perturbative approach to be followed (in appropriate 
circumstances) even in strong interaction physics. This programme has in- 
deed been phenomenally successful, firmly establishing QCD as the theory of 
strong interactions, and now — in the era of the LHC - serving as a precision 
tool to guide searches for new physics. 

A proper understanding of how this works necessitates a considerable de- 
tour, however, into the physics of renormalization. In particular, we need to 
understand the important cluster of ideas going under the general heading of 
the ‘renormalization group’, and this will be the topic of chapter 15. For the 
moment we proceed with a discussion of some simple tree-level applications 
of QCD, which provided early confrontation of QCD with experiment. 

Let us begin by recapitulating, from a QCD-informed viewpoint, how
the parton model successfully interpreted deep inelastic and large-$Q^2$ data
in terms of almost free point-like partons, now to be identified with the QCD
quanta: quarks, antiquarks, and gluons.

In section 9.5 we briefly introduced the idea of jets in $e^+e^-$ physics: two
well collimated sprays of hadrons, apparently created as a quark-antiquark
pair separate from each other at high speed. The angular distribution of
the two jets followed closely the distribution expected from the parton-level
process $e^+e^- \to q\bar{q}$. The dynamics at the parton level was governed by
QED, but QCD is responsible for the way the emerging $q$ and $\bar{q}$ turn themselves
into hadrons, a process called parton fragmentation (it occurs for gluons
too). We may think of it as proceeding in two stages. First, as the
rapidly moving $q$ and $\bar{q}$ begin to separate, they develop perturbative showers
of narrowly collimated gluons and quark-antiquark pairs. Then, as the
partons separate further, the strength of the forces between them increases,
becoming strongly non-perturbative at a separation of about 1 fm, and ensuring
that the coloured quanta are all confined into hadrons. As yet we
do not have a completely quantitative dynamical understanding of the second,
hadronization, stage: it is implemented by means of a model. Nevertheless,
we can argue that for the forces to be strong enough to produce
the observed hadrons, the dominant processes in hadronization must involve
small momentum transfers, that is, the exchange of ‘soft’ quanta. Thus the
emerging hadrons are also well collimated into two jets, whose energy and
angular distributions reflect the short-distance physics at the parton level.
This simple 2-jet picture will be extended in section 14.4, where we consider
$e^+e^- \to$ 3 jets.

A somewhat different aspect of parton physics arose in sections 9.2-9.3, 
where we considered deep inelastic electron scattering from nucleons. There 
the initial state contained one hadron. Correspondingly, one parton appeared 
in the initial state of the parton-level interaction, and the analysis required 
new functions measuring the probabilities of finding a particular parton in the 
parent hadron - the parton distribution functions. These too are beyond the 
reach of perturbation theory. 


We may also consider, finally, hadron-hadron collisions. In this case, we 
need all three of the features we have been discussing: the parton distribution
functions, to provide the initial parton-parton state from the two-hadron
state; the perturbative short-distance parton-parton interaction; and the parton
fragmentation process in the final state. These three parts to the process
are pictured in figure 14.5. The identification and analysis of short distance 
parton-parton interactions provide direct tests of the tree-graph structure of 
QCD, and perturbative corrections to it. 

This three-part schematization of certain features of hadronic interactions
is useful, because although we cannot yet calculate from first principles either
the parton distribution functions or the fragmentation process, both are
universal. The quark and gluon composition of hadrons is the same for all
processes, and so measurements in one experiment can be used to predict
the results of others. We saw an example of this in the Drell-Yan process of
section 9.4. As regards the fragmentation stage, this too will be universal, provided
one is interested in sufficiently inclusive aspects of the final state. The
three-part scheme is called factorization, and it has been rigorously proved for
some cases. We shall return to factorization in section 15.7.

FIGURE 14.5
Hadron-hadron collision involving parton-parton interaction followed by parton
fragmentation.

Let us turn now to some of the early data on parton-parton interactions 
in hadron-hadron collisions. 


14.3.2 Two-jet events in $\bar{p}p$ collisions


How are short-distance parton-parton interactions to be identified experimentally?
The answer is: in just the same way as Rutherford distinguished the
presence of a small heavy scattering centre (the nucleus) in the atom: by looking
at secondary particles emerging at large angles with respect to the beam
direction. For each secondary particle we can define a transverse momentum
$p_{\rm T} = p\sin\theta$, where $p$ is the particle momentum and $\theta$ is the emission angle
with respect to the beam axis. If hadronic matter were smooth and uniform
(cf the Thomson atom), the distribution of events in $p_{\rm T}$ would be expected
to fall off very rapidly at large $p_{\rm T}$ values, perhaps exponentially. This is
just what is observed in the vast majority of events: the average value of $p_{\rm T}$
measured for charged particles is very low ($\langle p_{\rm T}\rangle \sim 0.4$ GeV), but in a small
fraction of collisions the emission of high-$p_{\rm T}$ secondaries is observed. They
were first seen (Büsser et al. 1972, 1973, Alper et al. 1973, Banner et al.
1982) at the CERN ISR (CMS energies 30-62 GeV), and were interpreted
in parton terms as previously indicated. Referring to figure 14.5, a parton
from one hadron undergoes a short-distance ‘hard scattering’ interaction with
a parton from the other, leading in lowest-order perturbation theory to two
wide-angle partons, which then fragment into two jets.

We now face the experimental problem of picking out, from the enormous
multiplicity of total events, just these hard scattering ones, in order to analyse
them further. Early experiments used a trigger based on the detection of a
single high-$p_{\rm T}$ particle. But it turns out that such triggering really reduces
the probability of observing jets, since the probability that a single hadron in
a jet will actually carry most of the jet’s total transverse momentum is quite
small (Jacob and Landshoff 1978; Collins and Martin 1984, Chapter 5). It is
much better to surround the collision volume with an array of calorimeters
which measure the total energy deposited. Wide-angle jets can then be identified
by the occurrence of a large amount of total transverse energy deposited
in a number of adjacent calorimeter cells: this is then a ‘jet trigger’. The
importance of calorimetric triggers was first emphasized by Bjorken (1973),
following earlier work by Berman, Bjorken and Kogut (1971). The application
of this method to the detection and analysis of wide-angle jets was first
reported by the UA2 collaboration at the CERN $\bar{p}p$ collider (Banner et al.
1982). An impressive body of quite remarkably clean jet data was subsequently
accumulated by both the UA1 and UA2 collaborations (at $\sqrt{s} = 546$
GeV and 630 GeV), and by the CDF and D0 collaborations at the FNAL
Tevatron collider ($\sqrt{s} = 1.8$ TeV).
For each event the total transverse energy $\sum E_{\rm T}$ is measured, where

$$\sum E_{\rm T} = \sum_i E_i\sin\theta_i. \tag{14.46}$$

$E_i$ is the energy deposited in the $i$th calorimeter cell and $\theta_i$ is the polar
angle of the cell centre; the sum extends over all cells. Figure 14.6 shows the
$\sum E_{\rm T}$ distribution observed by UA2: it follows the ‘soft’ exponential form for
$\sum E_{\rm T} < 60$ GeV, but thereafter departs from it, showing clear evidence of the
wide-angle collisions characteristic of hard processes.

As we shall see shortly, the majority of ‘hard’ events are of two-jet type,
with the jets sharing the $\sum E_{\rm T}$ approximately equally. Thus a ‘local’ trigger
set to select events with localized transverse energy > 30 GeV and/or a ‘global’
trigger set at > 60 GeV can be used. At $\sqrt{s} > 500$-600 GeV there is plenty
of energy available to produce such events.
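
The sum (14.46) and the ‘global’ trigger condition translate directly into code. The sketch below is illustrative: the list of calorimeter cells is invented for the example (it is not UA2 data), each cell is described only by its deposited energy and the polar angle of its centre, and the function names are hypothetical.

```python
import math

def total_transverse_energy(cells):
    """Equation (14.46): sum of E_i * sin(theta_i) over all calorimeter cells.

    cells is an iterable of (energy_in_GeV, polar_angle_in_radians) pairs.
    """
    return sum(energy * math.sin(theta) for energy, theta in cells)

def passes_global_trigger(cells, threshold_gev=60.0):
    """'Global' trigger: total transverse energy above the chosen threshold."""
    return total_transverse_energy(cells) > threshold_gev

# Invented example: two energetic central cells plus two forward deposits.
example_cells = [
    (40.0, math.pi / 2),
    (38.0, math.pi / 2),
    (20.0, math.pi / 6),
    (5.0, 0.1),
]

sum_et = total_transverse_energy(example_cells)
print(round(sum_et, 2), passes_global_trigger(example_cells))
```

Energy deposited close to the beam axis contributes little to the sum, which is why a large $\sum E_{\rm T}$ signals wide-angle activity.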

FIGURE 14.6
Distribution of the total transverse energy $\sum E_{\rm T}$ observed in the UA2 central
calorimeter (DiLella 1985).

The total $\sqrt{s}$ value is important for another reason. Consider the kinematics
of the two-parton collision (figure 14.5) in the $\bar{p}p$ CMS. As in the Drell-Yan
process of section 9.4, the right-moving parton has 4-momentum

$$x_1p_1 = x_1(P, 0, 0, P) \tag{14.47}$$

and the left-moving one

$$x_2p_2 = x_2(P, 0, 0, -P) \tag{14.48}$$

where $P = \sqrt{s}/2$ and we are neglecting parton transverse momenta, which
are approximately limited by the observed $\langle p_{\rm T}\rangle$ value (~ 0.4 GeV, and thus
negligible on this energy scale). Consider the simple case of 90° scattering,
which requires (for massless partons) $x_1 = x_2$, equal to $x$ say. The total
outgoing transverse energy is then $2xP = x\sqrt{s}$. If this is to be greater than
50 GeV, then partons with $x > 0.1$ will contribute to the process. The parton
distribution functions are large at these relatively small $x$ values, due to sea
quarks (section 9.3) and gluons (figure 9.9), and thus we expect to obtain a
reasonable cross section.
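
The kinematics of (14.47) and (14.48) can be made concrete with a short calculation. This is a minimal sketch using only the massless, collinear approximation stated above; the function names are illustrative, and the collider energies are the UA1/UA2 values quoted earlier.

```python
import math

def parton_pair(x1, x2, sqrt_s):
    """Total energy and longitudinal momentum of the two incoming partons,
    from (14.47) and (14.48) with P = sqrt(s)/2 (masses and p_T neglected)."""
    p = sqrt_s / 2
    return (x1 + x2) * p, (x1 - x2) * p

def subprocess_cms_energy(x1, x2, sqrt_s):
    """Invariant mass of the parton pair, equal to sqrt(x1 * x2 * s)."""
    energy, p_z = parton_pair(x1, x2, sqrt_s)
    return math.sqrt(energy ** 2 - p_z ** 2)

def min_x_at_90_degrees(et_min, sqrt_s):
    """For x1 = x2 = x the outgoing transverse energy is x * sqrt(s)."""
    return et_min / sqrt_s

for sqrt_s in (546.0, 630.0):
    print(sqrt_s, round(min_x_at_90_degrees(50.0, sqrt_s), 3))
# 546.0 0.092
# 630.0 0.079
```

The threshold $x \approx 0.1$ quoted in the text is therefore a rounded value. When $x_1 \neq x_2$ the pair has non-zero longitudinal momentum, so the parton CMS moves along the beam direction in the laboratory.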

What are the characteristics of jet events? When $\sum E_{\rm T}$ is large enough
(> 150 GeV), it is found that essentially all of the transverse energy is indeed
split roughly equally between two approximately back-to-back jets. A typical
such event is shown in figure 14.7. Returning to the kinematics of (14.47)
and (14.48), $x_1$ will not in general be equal to $x_2$, so that (as is apparent in
figure 14.7) the jets will not be collinear. However, to the extent that the
transverse parton momenta can be neglected, the jets will be coplanar with
the beam direction, i.e. their relative azimuthal angle will be 180°. Figure
14.8 shows a number of examples in which the distribution of the transverse
energy over the calorimeter cells is analyzed as a function of the jet opening
angle $\theta$ and the azimuthal angle $\phi$. It is strikingly evident that we are seeing
precisely a kind of ‘Rutherford’ process, or (to vary the analogy) we might
say that hadronic jets are acting as the modern counterpart of Faraday's iron
filings, in rendering visible the underlying field dynamics!

FIGURE 14.7
Two-jet event. Two tightly collimated groups of reconstructed charged tracks
can be seen in the cylindrical central detector of UA1, associated with two
large clusters of calorimeter energy depositions. Figure reprinted with permission
from S Geer in High Energy Physics 1985, Proc. Yale Advanced Study
Institute eds M J Bowick and F Gursey; copyright 1986 World Scientific Publishing
Company.

FIGURE 14.8
Four transverse energy distributions for events with $\sum E_{\rm T} > 100$ GeV, in
the $\theta, \phi$ plane (UA2, DiLella 1985). Each bin represents a cell of the UA2
calorimeter. Note that the sum of the $\phi$’s equals 180° (mod 360°).

We may now consider more detailed features of these two-jet events, in
particular the expectations based on QCD tree graphs. The initial hadrons
provide wide-band beams of quarks, antiquarks and gluons²; thus we shall
have many parton subprocesses, such as $qq \to qq$, $q\bar{q} \to q\bar{q}$, $q\bar{q} \to gg$, $gg \to gg$,
etc. The most important, numerically, for a $\bar{p}p$ collider are $q\bar{q} \to q\bar{q}$, $gq \to gq$
and $gg \to gg$. The cross section will be given, in the parton model, by a
formula of the Drell-Yan type, except that the electromagnetic annihilation
cross section

$$\sigma(q\bar{q} \to \mu^+\mu^-) = 4\pi\alpha^2/3q^2 \tag{14.49}$$

is replaced by the various QCD subprocess cross sections, each one being
weighted by the appropriate distribution functions. At first sight this seems to
be a very complicated story, with so many contributing parton processes. But
a significant simplification comes from the fact that in the CMS of the parton
collision, all processes involving one gluon exchange will lead to essentially the
same dominant angular distribution of Rutherford-type, $\sim \sin^{-4}\theta/2$, where $\theta$
is the parton CMS scattering angle (recall section 1.3.6). This is illustrated
in table 14.1 (taken from Combridge et al. 1977), which lists the different
relevant spin-averaged, squared, one-gluon-exchange matrix elements $|\mathcal{M}|^2$,
where the parton differential cross section is given by (cf (6.129))

$$\frac{{\rm d}\sigma}{{\rm d}\cos\theta} = \frac{\pi\alpha_{\rm s}^2}{2\hat{s}}|\mathcal{M}|^2. \tag{14.50}$$

Here $\alpha_{\rm s} = g_{\rm s}^2/4\pi$, and $\hat{s}$, $\hat{t}$ and $\hat{u}$ are the subprocess invariants, so that

$$\hat{s} = (x_1p_1 + x_2p_2)^2 = x_1x_2s \quad \text{(cf (9.84))}. \tag{14.51}$$

² In the sense that the partons in hadrons have momentum or energy distributions, which
are characteristic of their localization to hadronic dimensions.

TABLE 14.1
Spin-averaged squared matrix elements for one-gluon exchange ($\hat{t}$-channel)
processes.

Subprocess                              $|\mathcal{M}|^2$

$qq \to qq$, $q\bar{q} \to q\bar{q}$    $\frac{4}{9}\,\frac{\hat{s}^2 + \hat{u}^2}{\hat{t}^2}$
$qg \to qg$                             $\frac{\hat{s}^2 + \hat{u}^2}{\hat{t}^2}$
$gg \to gg$                             $\frac{9}{4}\,\frac{\hat{s}^2 + \hat{u}^2}{\hat{t}^2}$


Continuing to neglect the parton transverse momenta, the initial parton 
configuration shown in figure 14.5 can be brought to the parton CMS by a Lorentz 
transformation along the beam direction, the outgoing partons then emerging 
back-to-back at an angle $\theta$ to the beam axis, so $\hat t \propto (1-\cos\theta) \propto \sin^2\theta/2$. Only 
the terms in $(\hat t)^{-2} \sim \sin^{-4}\theta/2$ are given in table 14.1. We note that the $\hat s$, $\hat t$, $\hat u$ 
dependence of these terms is the same for the three types of process (and is in 
fact the same as that found for the $1\gamma$ exchange process $e\mu \to e\mu$: see 
problem 8.17, converting $d\sigma/dt$ into $d\sigma/d\cos\theta$). Figure 14.9 shows the 
two-jet angular distribution measured by UA1 (Arnison et al. 1985). The broken 
FIGURE 14.9 
Two-jet angular distribution plotted against $\cos\theta$ (Arnison et al. 1985). 


curve is the exact angular distribution predicted by all the QCD tree graphs; 
it actually follows the $\sin^{-4}\theta/2$ shape quite closely. 
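
The Rutherford-type behaviour can be checked numerically. The sketch below evaluates (14.50) keeping only the dominant $\hat t$-channel term $(\hat s^2+\hat u^2)/\hat t^2$, with the numerical weights 4/9, 1 and 9/4 assumed for $qq$, $qg$ and $gg$ scattering (they are chosen to match the factor 4/9 in (14.54)). The function names, the value of $\alpha_s$ and the subprocess energy are illustrative assumptions, not taken from the analyses quoted in the text.

```python
import math

# Assumed numerical weights of the dominant 1/t^2 term for each subprocess.
WEIGHTS = {"qq": 4.0 / 9.0, "qg": 1.0, "gg": 9.0 / 4.0}


def invariants(s_hat, cos_theta):
    """Return (t_hat, u_hat) for massless 2 -> 2 scattering in the parton CMS."""
    t_hat = -0.5 * s_hat * (1.0 - cos_theta)
    u_hat = -0.5 * s_hat * (1.0 + cos_theta)
    return t_hat, u_hat


def dsigma_dcos(process, s_hat, cos_theta, alpha_s):
    """Equation (14.50) with only the t-channel term of |M|^2 retained."""
    t_hat, u_hat = invariants(s_hat, cos_theta)
    m2 = WEIGHTS[process] * (s_hat ** 2 + u_hat ** 2) / t_hat ** 2
    return math.pi * alpha_s ** 2 / (2.0 * s_hat) * m2


if __name__ == "__main__":
    s_hat, alpha_s = 100.0 ** 2, 0.2  # GeV^2 and an illustrative coupling
    for deg in (10, 30, 60, 90):
        theta = math.radians(deg)
        rate = dsigma_dcos("gg", s_hat, math.cos(theta), alpha_s)
        # Multiplying by sin^4(theta/2) removes the Rutherford-type peak.
        print(deg, rate, rate * math.sin(theta / 2.0) ** 4)
```

After multiplication by $\sin^4\theta/2$ the rate varies only through the slowly changing factor $1+\cos^4\theta/2$, and the three subprocesses differ only by a constant, which is the simplification exploited in the text.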

It is interesting to compare this angular distribution with the one predicted 
on the assumption that the exchanged gluon is a spinless particle, so that the 
vertices have the form ‘$\bar uu$’ rather than ‘$\bar u\gamma_\mu u$’. Problem 14.4 shows that in 
this case the $1/\hat t^2$ factor in the cross section is completely cancelled, thus ruling 
out such a model. 

This analysis provides compelling evidence for elementary hard scattering 
events proceeding via the exchange of a massless vector quantum. It is 
possible to go much further. Anticipating our later discussion, the small 
discrepancy between ‘tree graph’ theory (which is labelled ‘leading order QCD 
scaling curve’ in figure 14.9) and experiment can be accounted for by including 
corrections which are of higher order in $\alpha_s$. The solid curve in figure 14.9 
includes QCD corrections beyond the tree level, involving the ‘running’ of the 
coupling constant $\alpha_s$ and ‘scaling violation’ in the effective parton distribution 
functions, both of which effects will be discussed in the following chapter. 
The corrections lead to good agreement with experiment. 

The fact that the angular distributions of all the subprocesses are so similar 
allows further information to be extracted from these two-jet data. In general, 
the parton model cross section will have the form (cf (9.91)) 

$$\frac{d^3\sigma}{dx_1\,dx_2\,d\cos\theta} = \sum_{a,b}\frac{F_a(x_1)}{x_1}\,\frac{F_b(x_2)}{x_2}\sum_{c,d}\frac{d\sigma_{ab\to cd}}{d\cos\theta} \tag{14.52}$$

where $F_a(x_1)/x_1$ is the distribution function for partons of type ‘a’ ($q$, $\bar q$ or $g$), 
and similarly for $F_b(x_2)/x_2$. Using the near identity of all $d\sigma/d\cos\theta$'s, and 
FIGURE 14.10 

Effective distribution function measured from two-jet events (Arnison et al. 
1984 and Bagnaia et al. 1984). The broken and chain curves are obtained 
from deep inelastic neutrino scattering. Taken from DiLella (1985). 


noting the numerical factors in table 14.1, the sums over parton types reduce 
to 

$$\frac94\left\{g(x_1)+\frac49[q(x_1)+\bar q(x_1)]\right\}\left\{g(x_2)+\frac49[q(x_2)+\bar q(x_2)]\right\} \tag{14.53}$$

where $g(x)$, $q(x)$ and $\bar q(x)$ are the gluon, quark and antiquark distribution 
functions. Thus effectively the weighted distribution function³ 

$$\frac{F(x)}{x} = g(x) + \frac49[q(x)+\bar q(x)] \tag{14.54}$$

is measured (Combridge and Maxwell, 1984); in fact, with the weights as in 
(14.53), 

$$\frac{d^3\sigma}{dx_1\,dx_2\,d\cos\theta} = \frac{F(x_1)}{x_1}\cdot\frac{F(x_2)}{x_2}\cdot\frac{d\sigma_{gg\to gg}}{d\cos\theta}. \tag{14.55}$$

$x_1$ and $x_2$ are kinematically determined from the measured jet variables: from 
(14.51), 

$$x_1x_2 = \hat s/s \tag{14.56}$$

where $\hat s$ is the invariant [mass]$^2$ of the two-jet system and 

$$x_1 - x_2 = 2P_L/\sqrt s \quad (\text{cf } (9.82)) \tag{14.57}$$


³ The $\frac49$ reflects the relative strengths of the quark-gluon and gluon-gluon couplings in 
QCD; see problem 14.5. 
FIGURE 14.11 

The gluon distribution function g(x) extracted from the effective distribu- 
tion function F(x) by subtracting the expected contribution from the quarks 
and antiquarks. Figure reprinted with permission from S Geer in High En- 
ergy Physics 1985, Proc. Yale Theoretical Advanced Study Institute, eds M J 
Bowick and F Gursey; copyright 1986 World Scientific Publishing Company. 


with $P_L$ the total two-jet longitudinal momentum. Figure 14.10 shows $F(x)/x$ 
obtained in the UA1 (Arnison et al. 1984) and UA2 (Bagnaia et al. 1984) 
experiments. Also shown in this figure is the expected $F(x)/x$ based on 
contemporary fits to the deep inelastic neutrino scattering data at $Q^2 = 20$ GeV$^2$ 
and 2000 GeV$^2$ (Abramowicz et al. 1982a,b, 1983); the reason for the change 
with $Q^2$ will be discussed in section 15.6. The agreement is qualitatively very 
satisfactory. Subtracting the distributions for quarks and antiquarks as found 
in deep inelastic lepton scattering, UA1 were able to deduce the gluon structure 
function $g(x)$ shown in figure 14.11. It is clear that gluon processes will 
dominate at small $x$, and even at larger $x$ will be important because of the 
colour factors in table 14.1. 
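
As a small worked illustration of (14.56) and (14.57), the following sketch solves for $x_1$ and $x_2$ from the two-jet invariant mass and longitudinal momentum, and forms the weighted combination (14.54). The function names and the numerical inputs are hypothetical and chosen only to give round numbers; they are not measured values.

```python
import math


def parton_fractions(m_jj, p_long, sqrt_s):
    """Solve x1*x2 = s_hat/s (14.56) and x1 - x2 = 2*P_L/sqrt(s) (14.57)."""
    tau = (m_jj / sqrt_s) ** 2                   # s_hat / s, with s_hat = m_jj^2
    diff = 2.0 * p_long / sqrt_s                 # x1 - x2
    root = math.sqrt(diff * diff + 4.0 * tau)    # equals x1 + x2
    return 0.5 * (root + diff), 0.5 * (root - diff)


def effective_density(g, q, qbar):
    """F(x)/x of (14.54) from gluon, quark and antiquark densities at one x."""
    return g + (4.0 / 9.0) * (q + qbar)


if __name__ == "__main__":
    # Illustrative event: 126 GeV two-jet mass, 94.5 GeV longitudinal momentum,
    # at an assumed collider energy of 630 GeV.
    x1, x2 = parton_fractions(126.0, 94.5, 630.0)
    print(x1, x2, x1 * x2)
```

Since $(x_1+x_2)^2 = (x_1-x_2)^2 + 4x_1x_2$, both fractions follow from a single square root, and the positive root is the physical one.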


14.3.3 Three-jet events in $\bar pp$ collisions 


Although most of the high-$\sum E_T$ events at hadron colliders are two-jet events, 
in some 10-30% of the cases the energy is shared between three jets. An 
example is included as (d) in the collection of figure 14.8; a clearer one is 
shown in figure 14.12. In QCD such events are interpreted as arising from 
a 2 parton $\to$ 2 parton + 1 gluon process of the type $gg \to ggg$, $gq \to ggq$, 
etc. Once again, one can calculate (Kunszt and Pietarinen 1980, Gottschalk 
FIGURE 14.12 

Three-jet event in the UA1 detector, and the associated transverse energy flow 
plot. Figure reprinted with permission from S Geer in High Energy Physics 
1985, Proc. Yale Theoretical Advanced Study Institute, eds M J Bowick and 
F Gursey; copyright 1986 World Scientific Publishing Company. 
FIGURE 14.13 
Some tree graphs associated with three-jet events. 


and Sivers 1980, Berends et al. 1981) all possible contributing tree graphs, 
of the kind shown in figure 14.13, which should dominate at small $\alpha_s$. They 
are collectively known as QCD single-bremsstrahlung diagrams. Analysis of 
triple jets which are well separated both from each other and from the beam 
directions shows that the data are in good agreement with these lowest-order 
QCD predictions. For example, figure 14.14 shows the production angular 
distribution of UA2 (Appel et al. 1986) as a function of $\cos\theta^*$, where $\theta^*$ is 
the angle between the leading (most energetic) jet momentum and the beam 
axis, in the three-jet CMS. It follows just the same $\sin^{-4}\theta^*/2$ curve as in the 
two-jet case (the data for which are also shown in the figure), as expected 
FIGURE 14.14 

The distribution of $\cos\theta^*$ (●), the angle of the leading jet with respect to 
the beam line (normalized to unity at $\cos\theta^* = 0$), for three-jet events in 
$\bar pp$ collisions (Appel et al. 1986). The distribution for two-jet events is also 
shown (○). The full curve is a parton model calculation using the tree graph 
amplitudes for $gg \to ggg$, and cut-offs in transverse momentum and angular 
separation to eliminate divergences (see remarks following equation (14.73)). 


for massless quantum exchange; the particular curve is for the representative 
process $gg \to ggg$. 

Another qualitative feature is that the ratio of three-jet to two-jet events 
is controlled, roughly, by $\alpha_s$ (compare figure 14.13 with the graphs in table 
14.1). Thus an estimate of $\alpha_s$ can be obtained by comparing the rates of 
3-jet to 2-jet events in $\bar pp$ collisions. Other interesting predictions concern 
the characteristics of the 3-jet final state (for example, the distributions in 
the jet energy variables). At this point, however, it is convenient to leave $\bar pp$ 
collisions and consider instead 3-jet events in $e^+e^-$ collisions, for which the 
complications associated with the initial state hadrons are absent. 


14.4 3-jet events in $e^+e^-$ annihilation 


Three-jet events in $e^+e^-$ collisions originate, according to QCD, from gluon 
bremsstrahlung corrections to the two-jet parton level process $e^+e^- \to \gamma^* \to$ 
FIGURE 14.15 
Gluon bremsstrahlung corrections to two-jet parton level process. 


$q\bar q$, as shown in figure 14.15.⁴ This phenomenon was predicted by Ellis et al. 
(1976) and subsequently observed by Brandelik et al. (1979) with the TASSO 
detector at PETRA, and Barber et al. (1979) with MARK-J at PETRA, 
thus providing early encouragement for QCD. The situation here is in many 
ways simpler and cleaner than in the $\bar pp$ case: the initial state ‘partons’ are 
perfectly physical QED quanta, and their total 3-momentum is zero, so that 
the three jets have to be coplanar; further, there is only one type of diagram 
compared to the large number in the $\bar pp$ case, and much of that diagram 
involves the easier vertices of QED. Since the calculation of the cross section 
predicted from figure 14.15 is relevant not only to three-jet production in $e^+e^-$ 
collisions, but also to a satisfactory definition of the two-jet production cross 
section, to QCD corrections to the total $e^+e^-$ annihilation cross section, and 
to scaling violations in deep inelastic scattering as well, we shall now consider 
it in some detail. It is important to emphasize at the outset that quark masses 
will be neglected in this calculation. 


14.4.1 Calculation of the parton-level cross section 


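The quark, antiquark and gluon 4-momenta are $p_1$, $p_2$ and $p_3$ respectively, as 
shown in figure 14.15; the $e^-$ and $e^+$ 4-momenta are $k_1$ and $k_2$. The cross 
section is then (cf (6.110) and (6.112)) 

$$d\sigma = \frac{1}{(2\pi)^5}\,\delta^4(k_1+k_2-p_1-p_2-p_3)\,\frac{|\mathcal M_{q\bar qg}|^2}{2Q^2}\,\frac{d^3p_1}{2E_1}\,\frac{d^3p_2}{2E_2}\,\frac{d^3p_3}{2E_3} \tag{14.58}$$

where (neglecting all masses) 

$$\mathcal M_{q\bar qg} = \frac{e_ae^2g_s}{Q^2}\,\bar v(k_2)\gamma^\mu u(k_1)\left(\bar u(p_1)\gamma_\nu\frac{\lambda_c}{2}\cdot\frac{(\not{p}_1+\not{p}_3)}{2p_1\cdot p_3}\gamma_\mu v(p_2) - \bar u(p_1)\gamma_\mu\frac{(\not{p}_2+\not{p}_3)}{2p_2\cdot p_3}\gamma_\nu\frac{\lambda_c}{2}v(p_2)\right)\epsilon^{*\nu}(\lambda)\,a_c^* \tag{14.59}$$

and $Q^2 = 4E^2$ is the square of the total $e^+e^-$ energy, and also the square of 
the virtual photon's 4-momentum $Q$, and $e_a$ (in units of $e$) is the charge of a 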

⁴ This is assuming that the total $e^+e^-$ energy is far from the $Z^0$ mass; if not, the 
contribution from the intermediate $Z^0$ must be added to that from the photon. 
quark of type ‘a’. Note the minus sign in (14.59): the antiquark coupling is 
$-g_s$. In (14.59), $\epsilon^{*\nu}(\lambda)$ is the polarization vector of the outgoing gluon with 
polarization $\lambda$; $a_c$ is the colour wavefunction of the gluon ($c = 1, \ldots, 8$), 
and $\lambda_c$ is the corresponding Gell-Mann matrix introduced in section 12.2; the 
colour parts of the $q$ and $\bar q$ wavefunctions are understood to be included in 
the $u$ and $v$ factors; and $(\not{p}_1 + \not{p}_3)/2p_1\cdot p_3$ is the virtual quark propagator 
(cf (L.6) in appendix L of volume 1) before gluon radiation, and similarly for 
the antiquark. Since the colour parts separate from the Dirac trace parts, we 
shall ignore them to begin with, and reinstate the result of the colour sum 
(via problem 14.5) in the final answer (14.73). 

Averaging over $e^\pm$ spins and summing over final state quark spins and 
gluon polarization $\lambda$ (using (8.171), and noting the discussion after (13.93)), 
we obtain (problem 14.6) 

$$\frac14\sum_{\text{spins},\,\lambda}|\mathcal M_{q\bar qg}|^2 = \frac{e^4e_a^2g_s^2}{Q^4}\,L^{\mu\nu}(k_1,k_2)\,H_{\mu\nu}(p_1,p_2,p_3) \tag{14.60}$$

where the lepton tensor is, as usual (equation (8.119)), 

$$L^{\mu\nu}(k_1,k_2) = 2(k_1^\mu k_2^\nu + k_1^\nu k_2^\mu - k_1\cdot k_2\,g^{\mu\nu}) \tag{14.61}$$
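and the hadron tensor is 

$$\begin{aligned}
H_{\mu\nu}(p_1,p_2,p_3) ={}& \frac{1}{p_1\cdot p_3}\,[L_{\mu\nu}(p_2,p_3) - L_{\mu\nu}(p_1,p_1) + L_{\mu\nu}(p_1,p_2)] \\
&+ \frac{1}{p_2\cdot p_3}\,[L_{\mu\nu}(p_1,p_3) - L_{\mu\nu}(p_2,p_2) + L_{\mu\nu}(p_1,p_2)] \\
&+ \frac{p_1\cdot p_2}{(p_1\cdot p_3)(p_2\cdot p_3)}\,[2L_{\mu\nu}(p_1,p_2) + L_{\mu\nu}(p_1,p_3) + L_{\mu\nu}(p_2,p_3)].
\end{aligned} \tag{14.62}$$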

Combining (14.61) and (14.62) allows complete expressions for the five-fold 
differential cross section to be obtained (Ellis et al. 1976). 

For the subsequent discussion it will be useful to integrate over the three 
angles describing the orientation (relative to the beam axis) of the production 
plane containing the three partons. After this integration, the (doubly 
differential) cross section is a function of two independent Lorentz invariant 
variables, which are conveniently taken to be two of the three $s_{ij}$ defined by 

$$s_{ij} = (p_i+p_j)^2. \tag{14.63}$$

Since we are considering the massless case $p_i^2 = 0$ throughout, we may also 
write 

$$s_{ij} = 2p_i\cdot p_j. \tag{14.64}$$

These variables are linearly related by 

$$2(p_1\cdot p_2 + p_2\cdot p_3 + p_3\cdot p_1) = Q^2 \tag{14.65}$$
FIGURE 14.16 
Virtual photon decaying to $q\bar qg$. 


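as follows from 

$$(p_1+p_2+p_3)^2 = Q^2 \tag{14.66}$$

and $p_i^2 = 0$. The integration yields (Ellis et al. 1976, 1977) 

$$\frac{d^2\sigma}{ds_{13}\,ds_{23}} = \frac23\,\alpha^2e_a^2\alpha_s\,\frac{1}{(Q^2)^3}\left(\frac{s_{13}}{s_{23}}+\frac{s_{23}}{s_{13}}+\frac{2Q^2s_{12}}{s_{13}s_{23}}\right) \tag{14.67}$$

where $\alpha_s = g_s^2/4\pi$. 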

We may understand the form of this result in a simple way, as follows. It 
seems plausible that after integrating over the production angles, the lepton 
tensor will be proportional to $Q^2g^{\mu\nu}$, all directional knowledge of the $k_i$ having 
been lost. Indeed, if we use $-g^{\mu\nu}L_{\mu\nu}(p,p') = 4p\cdot p'$ together with (14.62) we 
easily find that 

$$\frac14(-g^{\mu\nu})H_{\mu\nu} = \frac{p_1\cdot p_3}{p_2\cdot p_3}+\frac{p_2\cdot p_3}{p_1\cdot p_3}+\frac{p_1\cdot p_2\,Q^2}{(p_1\cdot p_3)(p_2\cdot p_3)} = \frac{s_{13}}{s_{23}}+\frac{s_{23}}{s_{13}}+\frac{2Q^2s_{12}}{s_{13}s_{23}}, \tag{14.68}$$
exactly the factor appearing in (14.67). In turn, the result may be given 
a simple physical interpretation. From (7.118) we note that we can replace 
$-g^{\mu\nu}$ by $\sum_{\lambda'}\epsilon^\mu(\lambda')\epsilon^{\nu*}(\lambda')$ for a virtual photon of polarization $\lambda'$, the $\lambda' = 0$ 
state contributing negatively. Thus effectively the result of doing the angular 
integration is (up to constants and $Q^2$ factors) to replace the lepton factor 
$\bar v(k_2)\gamma^\mu u(k_1)$ by $-i\epsilon^\mu(\lambda')$, so that $\mathcal M_{q\bar qg}$ is proportional to the $\gamma^* \to q\bar qg$ 
processes shown in figure 14.16. But these are basically the same amplitudes 
as the ones we already met in Compton scattering (section 8.6). To compare 
with section 8.6.3, we convert the initial state fermion (electron/quark) into 
a final state antifermion (positron/antiquark) by $p \to -p$, and then identify 
the variables of figure 14.16 with those of figure 8.14 (a) by 

$$\begin{aligned}
&p' \to p_1 \qquad k' \to p_3 \qquad -p \to p_2 \qquad s \to 2p_1\cdot p_3 = s_{13} \\
&t \to 2p_1\cdot p_2 = s_{12} \qquad u \to 2p_2\cdot p_3 = s_{23}.
\end{aligned} \tag{14.69}$$

Remembering that in (8.181) the virtual $\gamma$ had squared 4-momentum $-Q^2$, 
we see that the Compton ‘$\sum|\mathcal M|^2$’ of (8.181) indeed becomes proportional 
to the factor (14.68), as expected. 
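
The contraction leading to (14.68) is easy to verify numerically. The sketch below builds the tensors explicitly for a set of massless parton momenta and compares $\frac14(-g^{\mu\nu})H_{\mu\nu}$ with the right-hand side of (14.68). It assumes the hadron tensor has the three-bracket form $\frac{1}{p_1\cdot p_3}[L(p_2,p_3)-L(p_1,p_1)+L(p_1,p_2)] + \frac{1}{p_2\cdot p_3}[L(p_1,p_3)-L(p_2,p_2)+L(p_1,p_2)] + \frac{p_1\cdot p_2}{(p_1\cdot p_3)(p_2\cdot p_3)}[2L(p_1,p_2)+L(p_1,p_3)+L(p_2,p_3)]$; all function names are hypothetical.

```python
import math

G = (1.0, -1.0, -1.0, -1.0)  # diagonal metric; momenta are (E, px, py, pz)


def dot(a, b):
    return sum(g * x * y for g, x, y in zip(G, a, b))


def L(p, k):
    """L_{mu nu}(p, k) = 2 (p_mu k_nu + p_nu k_mu - p.k g_{mu nu}), lower indices."""
    pl = [g * x for g, x in zip(G, p)]
    kl = [g * x for g, x in zip(G, k)]
    pk = dot(p, k)
    return [[2.0 * (pl[m] * kl[n] + pl[n] * kl[m] - (pk * G[m] if m == n else 0.0))
             for n in range(4)] for m in range(4)]


def combine(*terms):
    """Linear combination of tensors given as (coefficient, tensor) pairs."""
    return [[sum(c * t[m][n] for c, t in terms) for n in range(4)] for m in range(4)]


def hadron_tensor(p1, p2, p3):
    """H_{mu nu} in the three-bracket form assumed in the lead-in."""
    d13, d23, d12 = dot(p1, p3), dot(p2, p3), dot(p1, p2)
    a = combine((1, L(p2, p3)), (-1, L(p1, p1)), (1, L(p1, p2)))
    b = combine((1, L(p1, p3)), (-1, L(p2, p2)), (1, L(p1, p2)))
    c = combine((2, L(p1, p2)), (1, L(p1, p3)), (1, L(p2, p3)))
    return combine((1 / d13, a), (1 / d23, b), (d12 / (d13 * d23), c))


def quarter_trace(h):
    """(1/4) (-g^{mu nu}) H_{mu nu}."""
    return -0.25 * sum(G[m] * h[m][m] for m in range(4))


def momenta(x1, x2, q=1.0):
    """Massless q, qbar, g momenta in the overall CMS, with E_i = x_i Q / 2."""
    e1, e2 = 0.5 * x1 * q, 0.5 * x2 * q
    cos12 = 1.0 - 2.0 * (x1 + x2 - 1.0) / (x1 * x2)
    sin12 = math.sqrt(max(0.0, 1.0 - cos12 * cos12))
    p1 = (e1, e1, 0.0, 0.0)
    p2 = (e2, e2 * cos12, e2 * sin12, 0.0)
    p3 = (q - e1 - e2, -p1[1] - p2[1], -p1[2] - p2[2], 0.0)
    return p1, p2, p3


def rhs(p1, p2, p3):
    """Right-hand side of (14.68) in terms of s_ij = 2 p_i.p_j."""
    s12, s13, s23 = 2 * dot(p1, p2), 2 * dot(p1, p3), 2 * dot(p2, p3)
    q2 = s12 + s13 + s23
    return s13 / s23 + s23 / s13 + 2.0 * q2 * s12 / (s13 * s23)


if __name__ == "__main__":
    ps = momenta(0.7, 0.8)
    print(quarter_trace(hadron_tensor(*ps)), rhs(*ps))
```

The energy fractions $x_i = 2E_i/Q$ used to generate the momenta are a convenient parametrization of the coplanar three-parton configuration; any other massless, momentum-conserving set would serve equally well.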
FIGURE 14.17 
The kinematically allowed region in ($x_i$) is the interior of the equilateral 
triangle. 


14.4.2 Soft and collinear divergences 


In three-body final states of the type under discussion here it is often 
convenient to preserve the symmetry between the $s_{ij}$'s and use three (dimensionless) 
variables $x_i$ defined by 

$$s_{23} = Q^2(1 - x_1) \quad \text{and cyclic permutations.} \tag{14.70}$$

These are related by (14.65), which becomes 

$$x_1 + x_2 + x_3 = 2. \tag{14.71}$$

An event with a given value of the set $x_i$ can then be plotted as a point in an 
equilateral triangle of height 1, as shown in figure 14.17. In order to find the 
limits of the allowed physical region in this $x_i$ space, we now transform from 
the overall three-body CMS to the CMS of 2 and 3 (figure 14.18). If $\tilde\theta$ is the 
angle between 1 and 3 in this system, then (problem 14.7) 

$$\begin{aligned}
x_2 &= (1 - x_1/2) + (x_1/2)\cos\tilde\theta \\
x_3 &= (1 - x_1/2) - (x_1/2)\cos\tilde\theta.
\end{aligned} \tag{14.72}$$

The limits of the physical region are then clearly $\cos\tilde\theta = \pm 1$, which correspond 
to $x_2 = 1$ and $x_3 = 1$. By symmetry, we see that the entire perimeter of the 
triangle in figure 14.17 is the required boundary: physical events fall anywhere 
inside the triangle. (This is the massless limit of the classic Dalitz plot, first 
introduced by Dalitz (1953) for the analysis of $K \to 3\pi$.) Lines of constant $\tilde\theta$ 
are shown in figure 14.17. 
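
The kinematics of (14.70) to (14.72) can be summarized in a few lines of code. This is a minimal sketch with hypothetical function names; it converts invariants to the $x_i$, evaluates (14.72), and tests whether a point lies in the physical triangle.

```python
def x_from_sij(s12, s13, s23):
    """Invert (14.70): s23 = Q^2 (1 - x1) and cyclic permutations."""
    q2 = s12 + s13 + s23            # (14.65) with s_ij = 2 p_i.p_j
    return 1.0 - s23 / q2, 1.0 - s13 / q2, 1.0 - s12 / q2


def x2_x3(x1, cos_theta):
    """Equation (14.72): x2 and x3 for given x1 and the angle in the (2,3) CMS."""
    half = 0.5 * x1
    return (1.0 - half) + half * cos_theta, (1.0 - half) - half * cos_theta


def inside_triangle(x1, x2, x3, tol=1e-12):
    """True if the point satisfies (14.71) and has 0 <= x_i <= 1."""
    on_plane = abs(x1 + x2 + x3 - 2.0) < tol
    return on_plane and all(-tol <= x <= 1.0 + tol for x in (x1, x2, x3))


if __name__ == "__main__":
    for c in (-1.0, 0.0, 1.0):
        x2, x3 = x2_x3(0.8, c)
        print(c, x2, x3, inside_triangle(0.8, x2, x3))
```

At the two extreme values of the cosine one of $x_2$, $x_3$ reaches 1, which is the statement that these configurations lie on an edge of the triangle.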
FIGURE 14.18 
Definition of $\tilde\theta$. 


Now consider the distribution provided by the QCD bremsstrahlung 
process, equation (14.67), which can be written equivalently as 

$$\frac{d^2\sigma}{dx_1\,dx_2} = \sigma_{\rm pt}\,\frac{2\alpha_s}{3\pi}\left(\frac{x_1^2+x_2^2}{(1-x_1)(1-x_2)}\right) \tag{14.73}$$

where $\sigma_{\rm pt}$ is the pointlike $e^+e^- \to$ hadrons total cross section of (9.99), and 
a factor of 4 has been introduced from the colour sum (problem 14.5). The 
factor in large parentheses is (14.68) written in terms of the $x_i$ (problem 14.8). 
The most striking feature of (14.73) is that it is infinite as $x_1$ or $x_2$, or both, 
tend to 1, and in such a way that the cross section integrated over $x_1$ and 
$x_2$ diverges logarithmically. 
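
The growth of the integrated distribution as the boundary is approached can be seen with a crude numerical integration. The sketch below integrates the shape $(x_1^2+x_2^2)/((1-x_1)(1-x_2))$, multiplied by $2\alpha_s/3\pi$, over the part of the triangle with $1-x_i > y$ for all three $i$. The cut-off $y$, the midpoint rule, the grid size and the value of $\alpha_s$ are assumptions made purely for illustration.

```python
import math


def shape(x1, x2):
    """The x-dependent factor of the distribution (14.73)."""
    return (x1 * x1 + x2 * x2) / ((1.0 - x1) * (1.0 - x2))


def three_jet_fraction(alpha_s, y, n=400):
    """Midpoint-rule integral of (14.73)/sigma_pt over 1 - x_i > y, i = 1, 2, 3."""
    h = 1.0 / n
    total = 0.0
    for i in range(n):
        x1 = (i + 0.5) * h
        if x1 >= 1.0 - y:
            continue
        for j in range(n):
            x2 = (j + 0.5) * h
            # x3 = 2 - x1 - x2 < 1 - y  is equivalent to  x1 + x2 > 1 + y
            if x2 < 1.0 - y and x1 + x2 > 1.0 + y:
                total += shape(x1, x2)
    return 2.0 * alpha_s / (3.0 * math.pi) * total * h * h


if __name__ == "__main__":
    for y in (0.2, 0.1, 0.05, 0.02):
        print(y, three_jet_fraction(0.12, y))
```

The result keeps increasing as $y$ is reduced instead of settling to a limit, which is the numerical signature of the divergence at $x_1 \to 1$ and $x_2 \to 1$.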

This is a quite different infinity from the ones encountered in the loop 
integrals of chapters 10 and 11. No integral over an arbitrarily large internal 
momentum is involved here: the tree amplitude itself is becoming singular on 
the phase space boundary. We can trace the origin of the singularity back to 
the denominator factors $(p_1\cdot p_3)^{-1} \sim (1 - x_2)^{-1}$ and $(p_2\cdot p_3)^{-1} \sim (1 - x_1)^{-1}$ 
in (14.59). These become zero in two distinct configurations of the gluon 
momentum: 

(i) $p_3 \propto p_1$ or $p_3 \propto p_2$ (using $p_i^2 = 0$) (14.74) 
(ii) $p_3 \to 0$ (14.75) 

which are easily interpreted physically. Condition (i) corresponds to a 
situation in which the 4-momentum of the gluon is parallel to that of either 
the quark or the antiquark; this is called a ‘collinear divergence’ and the 
configuration is pictured in figure 14.19(a). If we restore the quark masses, 
$p_1^2 = m_1^2 \neq 0$ and $p_2^2 = m_2^2 \neq 0$, then the factor $(2p_1\cdot p_3)^{-1}$, for example, 
becomes $((p_1 + p_3)^2 - m_1^2)^{-1}$, whose denominator vanishes only as $p_3 \to 0$, which is condition 
(ii). The divergence of type (i) is therefore also termed a ‘mass singularity’, 
as it would be absent if the quarks had mass. Condition (ii) corresponds to 
the emission of a very ‘soft’ gluon (figure 14.19(b)) and is called a ‘soft, or 
infrared, divergence’. In contrast to this, the gluon momentum $p_3$ in type (i) 
does not have to be vanishingly small. 




FIGURE 14.19 

Gluon configurations leading to divergences of equation (14.73): (a) gluon 
emitted approximately collinear with quark (or antiquark); (b) soft gluon 
emission. The events are viewed in the overall CMS. 


It is apparent from these figures that in either of these two cases the 
observed final state hadrons, after the fragmentation process, will in fact 
resemble a two-jet configuration. Such events will be found in the regions $x_1 \approx 1$ 
and/or $x_2 \approx 1$ of the kinematical plot shown in figure 14.17, which correspond 
to strips adjacent to two of the boundaries of the triangle. Events outside these 
strips should be essentially three-jet events, corresponding to the emission of 
a hard, non-collinear gluon. To isolate such events, we must keep away from 
the boundaries of the triangle (the strip along the third boundary $x_3 = 1$ will 
not contain a divergence, but will be included in a physical jet measure; see 
the following section). Thus to order $\alpha^2\alpha_s$, the total annihilation cross section 
to three jets is given by the integral of (14.73) over a suitably defined inner 
triangular region in figure 14.17. 

Assuming such a separation of three- and two-jet events can be done 
satisfactorily (see the next section), their ratio carries important information: 
namely, it should be proportional to $\alpha_s$. This follows simply from the extra 
factor of $g_s$ associated with the gluon emissions in figure 14.15. Glossing over 
a number of technicalities (for which the reader is referred to Ellis, Stirling 
and Webber 1996, section 3.3), we show in figure 14.20 a compilation of data 
on the fraction of three-jet events at different $e^+e^-$ annihilation energies. The 
most remarkable feature of this figure is, of course, that this fraction (and 
hence $\alpha_s$) changes with energy, decreasing as the energy increases. This is, in 
fact, direct evidence for asymptotic freedom. A more recent comparison 
between theory and experiment (the agreement is remarkable) will be presented 
in the following chapter, section 15.3, after we have introduced the theoretical 
framework for calculating the energy dependence of $\alpha_s$. 


14.5 Definition of the two-jet cross section in $e^+e^-$ 
annihilation 


As just noted, the integral of (14.73) over the remaining regions of figure 14.17, 
near the phase-space boundaries, will contribute to the two-jet annihilation 
cross section — and it is divergent. Clearly this is not a physically acceptable 
FIGURE 14.20 
A compilation of three-jet fractions at different $e^+e^-$ annihilation energies. 
Adapted from Akrawy et al. (OPAL) (1990); figure from R K Ellis, W J Stir- 
ling and B R Webber (1996) QCD and Collider Physics, courtesy Cambridge 
University Press. 


result: we want a finite two-jet cross section. The cure lies in recognizing 
that at the order to which we are working, namely $\alpha^2\alpha_s$, other parton-level 
graphs can contribute. These are the one-gluon loop graphs shown in figure 
14.21, which are of order $\alpha\alpha_s$. They turn out to contain exactly the same soft 
and collinear divergences, this time associated with configurations of virtual 
momenta inside the loops. In a carefully defined two-jet cross section, these 
two classes of divergences (one from real gluon emission, the other from virtual 
gluons) actually cancel. 

Let us call the amplitude for the sum of these three graphs $F_{\rm vg}$, where ‘vg’ 
stands for virtual gluon. $F_{\rm vg}$ is the order $\alpha_s$ correction to the original order 
$\alpha$ parton-level graph of figure 9.17, shown here again in figure 14.22, with 
amplitude $F_\gamma$. The cross section from these contributions is proportional to 
$|F_\gamma + F_{\rm vg}|^2$. There are three terms in this expression: one of order $\alpha^2$, from 
$|F_\gamma|^2$; another of order $\alpha^2\alpha_s^2$, from $|F_{\rm vg}|^2$, which we drop since it is of higher 
order in $\alpha_s$; and an interference term of order $\alpha^2\alpha_s$, the same as (14.73). 
Thus the interference term must be included in calculating the two-jet cross 
section to this order. When it is, the soft and collinear divergences cancel⁵: 
the resulting two-jet cross section is IRC (infrared and collinear) ‘safe’. 


⁵ The usual ultraviolet divergences in the loop graphs are removed by conventional 
renormalization. 
FIGURE 14.21 
Virtual gluon corrections to figure 14.22. 
FIGURE 14.22 
One-photon annihilation amplitude in $e^+e^- \to q\bar q$. 


This result was first shown by Sterman and Weinberg (1977), in a paper 
which initiated the modern treatment of jets within the framework of QCD. 
They defined the two-jet differential cross section to include those events in 
which all but a fraction $\epsilon$ of the total $e^+e^-$ energy $E$ $(= \sqrt{Q^2})$ is emitted 
within some pair of oppositely directed cones of half-angle $\delta \ll 1$, lying at an 
angle $\theta$ to the $e^+e^-$ beam line. Including the contributions of real and virtual 
gluons up to order $\alpha\alpha_s$, the result is (Muta 2010, section 5.4.1) 

$$\frac{d\sigma}{d\Omega} = \left(\frac{d\sigma}{d\Omega}\right)_0\left[1 - \frac{4\alpha_s}{3\pi}\left(3\ln\delta + 4\ln\delta\,\ln 2\epsilon + \frac{\pi^2}{3} - \frac52\right)\right] \tag{14.76}$$

where $(d\sigma/d\Omega)_0$ is the contribution of the lowest-order graph, figure 14.22, which 
is given by equation (9.102) summed over quark colours and charges; terms of 
order $\delta$ and $\epsilon$, and higher powers, are neglected. It is evident from (14.76) that 
the jet parameters $\epsilon$ and $\delta$ serve to control the soft and collinear divergences, 
which reappear as $\epsilon$ and $\delta$ tend to zero; they are ‘resolution parameters’. 
The remarkable cancellation of the soft and collinear divergences between 
the real and virtual emission processes is actually a general result in QED 
(recall that in chapter 11 we declined to pursue the problem of such infrared 
divergences). The Bloch-Nordsieck (1937) theorem states that ‘soft’ 
singularities cancel between real and virtual processes when one adds up all final 
states which are indistinguishable by virtue of the energy resolution of the 
apparatus. The Kinoshita (1962), Lee and Nauenberg (1964) theorem states, 
roughly speaking, that mass singularities are absent if one adds up all 
indistinguishable mass-degenerate states. This is the reason for the finiteness of 
the Sterman-Weinberg 2-jet cross section, in an analogous QCD case. 

Returning to (14.76), it is important to note that the angular distribution 
of this well-defined two-jet process is given precisely by the lowest-order 
expression (9.102), just as was hoped in the simple parton model of section 
9.5. Of course, the cross section depends on the jet parameters $\delta$ and $\epsilon$. The 
formula (14.76) can be used, for example, to estimate the angular radius of 
the jets, as a function of $E$. 
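
To get a feeling for the size of the correction, the following sketch evaluates the factor multiplying the lowest-order cross section, taking the Sterman-Weinberg bracket in the form $1 - \frac{4\alpha_s}{3\pi}(3\ln\delta + 4\ln\delta\ln 2\epsilon + \pi^2/3 - 5/2)$. The function name and the values of $\alpha_s$, $\epsilon$ and $\delta$ are illustrative assumptions; the formula neglects terms of order $\delta$ and $\epsilon$, so it should only be used for small values of both.

```python
import math


def sterman_weinberg_fraction(alpha_s, eps, delta):
    """Ratio of the two-jet cross section to its lowest-order value."""
    bracket = (3.0 * math.log(delta)
               + 4.0 * math.log(delta) * math.log(2.0 * eps)
               + math.pi ** 2 / 3.0
               - 2.5)
    return 1.0 - 4.0 * alpha_s / (3.0 * math.pi) * bracket


if __name__ == "__main__":
    for delta in (0.3, 0.1, 0.03):
        print(delta, sterman_weinberg_fraction(0.12, 0.1, delta))
```

For fixed $\epsilon$ the two-jet fraction falls as the cone half-angle is reduced, and for sufficiently small $\delta$ the first-order expression becomes negative, signalling that the logarithms have grown too large for this order of perturbation theory to be trusted.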

Although the Sterman-Weinberg jet definition was historically the first, 
it is not the only possible one. Another, in some ways simpler, definition 
(Kramer and Lampe 1987) is directly phrased in terms of the offending 
denominators $s_{13}$ and $s_{23}$ in (14.67). Let us introduce the dimensionless jet 
mass variables 

$$y_{ij} = s_{ij}/Q^2 = 2E_iE_j(1 - \cos\theta_{ij})/Q^2 \tag{14.77}$$

for any two partons $i$ and $j$; $s_{12}$ will be included, though no singularity is 
involved. Here $E_i$ and $E_j$ are the (massless) parton energies, and $\theta_{ij}$ is the 
angle between their 3-momenta, in the overall CMS. Then $i$ and $j$ are defined 
to be in one jet if $y_{ij}$ is less than some given number $y$. Note that for small 
$\theta_{ij}$, $y_{ij} \approx E_iE_j\theta_{ij}^2/Q^2$, so the single parameter $y$ provides effectively both an 
energy and an angle cut. Clearly this definition is equivalent to a formulation 
in terms of strips $1 > x_k > 1 - y$ on figure 14.17, as discussed earlier. Including 
contributions, as before, from figures 14.22, 14.21, and 14.16, the resulting 
2-jet cross section is found to be (Kramer and Lampe 1987) 

$$\sigma_{2\text{-jet}} = \sigma_{\rm pt}\left[1 - \frac{2\alpha_s}{3\pi}\left(2\ln^2y + 3\ln y - 4y\ln y + 1 - \pi^2/3\right)\right]. \tag{14.78}$$

Terms of order $y$ were calculated numerically. These include the contribution 
from the (non-singular) region $y_{12} < y$, where the two quarks are in one jet 
and the other jet is a pure gluon jet. Plainly the IRC singularities have been 
eliminated from (14.78), at the cost of the jet mass resolution parameter $y$. 
Kramer and Lampe also calculated the order $\alpha_s^2$ corrections to (14.78). 
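
A short numerical sketch of the jet-mass variable and of the two-jet fraction may help. It takes $y_{ij} = 2E_iE_j(1-\cos\theta_{ij})/Q^2$ and the two-jet result in the form $\sigma_{\rm pt}[1 - \frac{2\alpha_s}{3\pi}(2\ln^2 y + 3\ln y - 4y\ln y + 1 - \pi^2/3)]$; the function names and the value of $\alpha_s$ are assumptions for illustration only.

```python
import math


def jet_mass(e_i, e_j, cos_ij, q2):
    """Dimensionless jet mass y_ij = 2 E_i E_j (1 - cos theta_ij) / Q^2."""
    return 2.0 * e_i * e_j * (1.0 - cos_ij) / q2


def two_jet_fraction(alpha_s, y):
    """sigma_2-jet / sigma_pt to first order in alpha_s for resolution y."""
    ln_y = math.log(y)
    bracket = (2.0 * ln_y * ln_y + 3.0 * ln_y - 4.0 * y * ln_y
               + 1.0 - math.pi ** 2 / 3.0)
    return 1.0 - 2.0 * alpha_s / (3.0 * math.pi) * bracket


if __name__ == "__main__":
    # Symmetric three-parton event: E_i = Q/3 and 120 degrees between partons.
    print(jet_mass(1.0 / 3.0, 1.0 / 3.0, -0.5, 1.0))
    for y in (0.1, 0.05, 0.01):
        print(y, two_jet_fraction(0.12, y))
```

For the symmetric configuration every $y_{ij}$ equals 1/3, so such an event counts as three jets for any resolution $y < 1/3$.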

These two ways of regulating the IRC divergences in the 2-jet partonic 
cross section have each been extensively developed into jet algorithms, as we 
shall briefly discuss in section 14.6.2. 


14.6 Further developments 
14.6.1 Test of non-Abelian nature of QCD in $e^+e^- \to$ 4 jets 


We have seen in section 14.3.1 how the colour factors associated with different 
QCD vertices (problem 14.5) play an important part in determining the 
relative weights of different parton-level processes. The quark-gluon colour factor 
$C_F$ enters into the parton-level three-jet amplitude (14.67), but the 
triple-gluon vertex is not involved at order $\alpha_s$. This vertex is an essential feature of 
non-Abelian gauge theories, being absent in Abelian theories such as QED. A 
direct measurement of the triple-gluon vertex colour factor, $C_A$, can be made 
in the process $e^+e^- \to$ 4 jets. 

4-jet events originate from the parton-level process $e^+e^- \to q\bar qg$ via three 
mechanisms: the emission of a second bremsstrahlung gluon, splitting of the 
first gluon into two gluons, and splitting of the first gluon into $n_f$ quark pairs. 
As problem 14.5 shows, these three types of splitting vertices are characterized 
in cross sections by the colour factors $C_F$, $C_A$ and $n_fT_R$, so that the cross 
section can be written as (Ali and Kramer 2011) 

$$\sigma_{4\text{-jet}} = \left(\frac{\alpha_s}{\pi}\right)^2C_F\,[C_F\sigma_{bb} + C_A\sigma_{gg} + n_fT_R\sigma_{qq}]. \tag{14.79}$$

Measurements yield (Abbiendi et al. 2001) 

$$\begin{aligned}
C_A/C_F &= 2.29 \pm 0.06[\text{stat.}] \pm 0.14[\text{syst.}] \\
T_R/C_F &= 0.38 \pm 0.03[\text{stat.}] \pm 0.06[\text{syst.}],
\end{aligned} \tag{14.80}$$

in good agreement with the theoretical predictions $C_A/C_F = 9/4$ and $T_R/C_F = 
3/8$ in QCD. 
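
The comparison can be made quantitative in a few lines. The sketch below uses the standard SU(3) values $C_F = 4/3$, $C_A = 3$ and $T_R = 1/2$ (which reproduce the ratios 9/4 and 3/8 quoted above) and the measured values $2.29 \pm 0.06 \pm 0.14$ and $0.38 \pm 0.03 \pm 0.06$. Combining the statistical and systematic errors in quadrature is an assumption made here for illustration, not a statement about the experimental analysis.

```python
from fractions import Fraction
import math

N_C = 3                                    # number of colours
C_F = Fraction(N_C * N_C - 1, 2 * N_C)     # 4/3
C_A = Fraction(N_C)                        # 3
T_R = Fraction(1, 2)

# (value, statistical error, systematic error) for each measured ratio
MEASURED = {
    "CA/CF": (2.29, 0.06, 0.14),
    "TR/CF": (0.38, 0.03, 0.06),
}
PREDICTED = {"CA/CF": C_A / C_F, "TR/CF": T_R / C_F}


def pull(name):
    """(measured - predicted) / total error, errors added in quadrature."""
    value, stat, syst = MEASURED[name]
    return (value - float(PREDICTED[name])) / math.hypot(stat, syst)


if __name__ == "__main__":
    for name in MEASURED:
        print(name, PREDICTED[name], round(pull(name), 2))
```

Both ratios agree with the SU(3) expectation to well within one combined standard deviation.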


14.6.2 Jet algorithms 


From the examples already discussed in this chapter, it is clear that jets are an 
essential element in making comparisons between experimental measurements 
involving final state particles in detectors, and theoretical calculations at the 
parton level using perturbative QCD. Conceptually, jets provide a common 
representation for these two classes of event — those at the detector level, 
and those at the parton level. For precision comparisons, it is necessary to 
have a rigorous definition of a jet — a jet algorithm — which should be equally 
applicable at the detector, and at the parton, level. In the more than thirty 
years that have passed since Sterman and Weinberg’s 1977 paper, many jet 
definitions have been developed and applied. All involve the basic notion of 
clustering together objects that are in some sense ‘near’ to each other. Two 
main classes of jet algorithm may be distinguished: cone algorithms based on 
proximity in coordinate space, as in the Sterman-Weinberg approach, and used 
extensively, until recently, at hadron colliders; and sequential recombination 
algorithms based on proximity in momentum space, as in the jet-mass criterion 
of Kramer and Lampe (1987), and widely used at $e^+e^-$ and $ep$ colliders. 
Recent general reviews of jet algorithms are provided by Salam (2010) and by 
Ali and Kramer (2011); see also Ellis et al. (2008), Campbell et al. (2007), 
and Kluth (2006). Here we shall give only a brief introduction to sequential 
recombination algorithms — all of which are IRC safe — since it seems likely 
that they will dominate future jet analyses. 
The JADE algorithm (Bartel et al. 1986, Bethke et al. 1988) is a 
prominent early example of sequential recombination algorithms applied in $e^+e^-$ 
annihilation reactions. Particles are clustered in a jet iteratively as long as 
the quantity $y_{ij}$ of (14.77) is less than some prescribed value $y_c$. If for some 
pair $(i, j)$, $y_{ij} < y_c$, particles $i$ and $j$ are combined into a compound object 
(with the resultant 4-momentum, typically), and the process continues by 
pairing the compound with a new particle $k$. The procedure stops when all 
$y_{ij}$ distances are greater than $y_c$, and the compounds that remain at this stage 
are the jets, by definition. 
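
The procedure just described can be sketched as follows. This is a minimal illustration, not the implementation used by any experiment: it merges the closest pair first (the usual convention, assumed here), combines 4-momenta by simple addition, and takes $Q^2$ to be the invariant mass squared of the whole event. All names are hypothetical, and particles are tuples $(E, p_x, p_y, p_z)$ with non-zero 3-momentum.

```python
import math


def y_jade(a, b, q2):
    """Distance 2 E_a E_b (1 - cos theta_ab) / Q^2 for (E, px, py, pz) tuples."""
    na = math.sqrt(a[1] ** 2 + a[2] ** 2 + a[3] ** 2)
    nb = math.sqrt(b[1] ** 2 + b[2] ** 2 + b[3] ** 2)
    cos_ab = (a[1] * b[1] + a[2] * b[2] + a[3] * b[3]) / (na * nb)
    return 2.0 * a[0] * b[0] * (1.0 - cos_ab) / q2


def cluster(particles, y_cut, distance=y_jade):
    """Merge the closest pair repeatedly until every distance is >= y_cut."""
    jets = [tuple(p) for p in particles]
    e, px, py, pz = (sum(c) for c in zip(*jets))
    q2 = e * e - px * px - py * py - pz * pz
    while len(jets) > 1:
        y_min, i, j = min((distance(jets[i], jets[j], q2), i, j)
                          for i in range(len(jets))
                          for j in range(i + 1, len(jets)))
        if y_min >= y_cut:
            break
        merged = tuple(u + v for u, v in zip(jets[i], jets[j]))
        jets = [p for k, p in enumerate(jets) if k not in (i, j)] + [merged]
    return jets


if __name__ == "__main__":
    e = 100.0 / 3.0
    r3 = math.sqrt(3.0) / 2.0
    event = [(e, e, 0.0, 0.0), (e, -e / 2, e * r3, 0.0), (e, -e / 2, -e * r3, 0.0)]
    for y_cut in (0.1, 0.4):
        print(y_cut, len(cluster(event, y_cut)))
```

The number of jets found depends on the resolution: the symmetric three-parton event above has all pair distances equal to 1/3, so it is resolved as three jets for a cut below that value and as two jets above it.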


One drawback with this scheme is that in higher orders of perturbation 
theory one meets terms of the form $\alpha_s^n\ln^{2n}y$ (generalizations of the $\alpha_s\ln^2y$ 
term in (14.78)). Such terms can be large enough to invalidate a perturbative 
approach. Also, it is possible for two soft particles moving in opposite 
directions to get combined in the early stages of clustering, which runs counter 
to the intuitive notion of a jet being restricted in angular radius. The 
$k_t$ algorithm (Catani et al. 1991) avoids these problems by replacing the $y_{ij}$ of 
(14.77) by 

$$y_{ij} = 2\min[E_i^2, E_j^2](1 - \cos\theta_{ij})/Q^2. \tag{14.81}$$

This amounts to defining ‘distance’ by the minimum transverse momentum 
$k_t$ of the particles in nearby collinear pairs. The use of the minimum energy 
ensures that the distance between two soft, back-to-back particles is larger 
than that between a soft particle and a hard one that is close to it in angle. 
The $k_t$ algorithm was widely used at LEP. 
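
The difference between the two distance measures shows up already in a three-particle example. The sketch below compares the JADE measure $2E_iE_j(1-\cos\theta_{ij})/Q^2$ with the $k_t$ measure $2\min[E_i^2,E_j^2](1-\cos\theta_{ij})/Q^2$ for two soft back-to-back particles and a hard particle close in angle to one of them. The momenta and the value of $Q^2$ are invented for illustration.

```python
import math


def cos_angle(a, b):
    """Cosine of the angle between the 3-momenta of (E, px, py, pz) tuples."""
    na = math.sqrt(a[1] ** 2 + a[2] ** 2 + a[3] ** 2)
    nb = math.sqrt(b[1] ** 2 + b[2] ** 2 + b[3] ** 2)
    return (a[1] * b[1] + a[2] * b[2] + a[3] * b[3]) / (na * nb)


def y_jade(a, b, q2):
    """JADE distance: product of the two energies."""
    return 2.0 * a[0] * b[0] * (1.0 - cos_angle(a, b)) / q2


def y_kt(a, b, q2):
    """k_t distance: square of the smaller energy."""
    return 2.0 * min(a[0], b[0]) ** 2 * (1.0 - cos_angle(a, b)) / q2


if __name__ == "__main__":
    q2 = 100.0 ** 2
    soft_up = (1.0, 0.0, 0.0, 1.0)
    soft_down = (1.0, 0.0, 0.0, -1.0)
    hard = (49.0, 49.0 * math.sin(0.3), 0.0, 49.0 * math.cos(0.3))
    print("JADE:", y_jade(soft_up, soft_down, q2), y_jade(soft_up, hard, q2))
    print("k_t :", y_kt(soft_up, soft_down, q2), y_kt(soft_up, hard, q2))
```

With the JADE measure the two soft particles are closer to each other than the soft particle is to its hard neighbour, whereas with the $k_t$ measure the ordering is reversed, as the text describes.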


The basic idea of the k; algorithm was extended to hadron colliders (Ellis 
and Soper 1993, Catani et al. 1993), where the total energy of the hard 
scattering particles is not well defined experimentally. The distance measure 
yij is replaced by 


$$d_{ij} = \min[p_{ti}^{2p}, p_{tj}^{2p}]\,[(y_i - y_j)^2 + (\phi_i - \phi_j)^2]/R^2 \tag{14.82}$$

where, for particle $i$, $p_{ti}$ is the transverse momentum with respect to the
(beam) $z$-axis, $y_i$ is the rapidity along the beam axis (defined by
$y_i = \frac{1}{2}\ln[(E_i + p_{zi})/(E_i - p_{zi})]$), $\phi_i$ is the azimuthal angle in the plane
transverse to the beam, and $R$ is a jet parameter. The variables $y_i$, $\phi_i$ have the
property that differences in them are invariant under boosts along the beam direction.
In addition, recombination with the beam jets is controlled by the quantity
$d_{iB} = p_{ti}^{2p}$, which is included along with the $d_{ij}$'s when recombining all the
particles into (i) jets with non-zero transverse momentum, and (ii) beam jets. The
power parameter $p$ takes the value 1 in the (extended) $k_t$ algorithm, and $-1$ in the
‘anti-$k_t$’ algorithm (Cacciari et al. 2008). Whereas the former (and $p = 0$) leads to
irregularly shaped jet boundaries, the latter leads to cone-like boundaries. The choice
$p = -1$ was made in early LHC analyses.
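
The distance measures just described can be sketched numerically. The following is a minimal illustration, not a jet finder: the function names, the folding of the azimuthal difference into $[0, \pi]$, and the two sample particles are all assumptions made for the example.

```python
import math

def rapidity(E, pz):
    """Rapidity y = (1/2) ln[(E + pz)/(E - pz)] along the beam (z) axis."""
    return 0.5 * math.log((E + pz) / (E - pz))

def delta_phi(phi_i, phi_j):
    """Azimuthal separation folded into [0, pi] (an added convention)."""
    d = abs(phi_i - phi_j) % (2.0 * math.pi)
    return 2.0 * math.pi - d if d > math.pi else d

def d_ij(pt_i, y_i, phi_i, pt_j, y_j, phi_j, p, R):
    """Pairwise distance of (14.82) for power parameter p and jet parameter R."""
    dr2 = (y_i - y_j) ** 2 + delta_phi(phi_i, phi_j) ** 2
    return min(pt_i ** (2 * p), pt_j ** (2 * p)) * dr2 / R ** 2

def d_iB(pt_i, p):
    """Beam distance p_ti^(2p)."""
    return pt_i ** (2 * p)

# Hypothetical pair (pt, y, phi): one hard and one soft particle, 0.2 apart in rapidity.
hard = (100.0, 0.0, 0.0)
soft = (1.0, 0.2, 0.0)
d_kt = d_ij(*hard, *soft, p=1, R=0.4)       # set by the soft particle's pt
d_antikt = d_ij(*hard, *soft, p=-1, R=0.4)  # set by the hard particle's pt
print(d_kt, d_antikt, d_iB(hard[0], -1))
```

With $p = 1$ the pair distance is governed by the softer particle, while with $p = -1$ it is governed by the harder one, which is why anti-$k_t$ jets grow outwards from hard particles.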
Problems
14.1 
(a) Show that the antisymmetric 3q combination of equation (14.2) 


is (i) a determinant, and (ii) invariant under the transformation 
(14.14) for each colour wavefunction. 


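(b) Suppose that $p_\alpha$ and $q_\alpha$ stand for two SU(3)$_c$ colour wavefunctions,
transforming under an infinitesimal SU(3)$_c$ transformation via

$$p' = (1 + i\boldsymbol{\eta}\cdot\boldsymbol{\lambda}/2)\,p,$$

and similarly for $q$. Consider the antisymmetric combination of
their components, given by

$$\begin{pmatrix} p_2 q_3 - p_3 q_2 \\ p_3 q_1 - p_1 q_3 \\ p_1 q_2 - p_2 q_1 \end{pmatrix} \equiv \begin{pmatrix} Q_1 \\ Q_2 \\ Q_3 \end{pmatrix};$$

that is, $Q_\alpha = \epsilon_{\alpha\beta\gamma}\, p_\beta q_\gamma$. Check that the three components $Q_\alpha$
transform as a $\mathbf{3}^*_c$, in the particular case for which only the parameters
$\eta_1, \eta_2, \eta_3$ and $\eta_8$ are non-zero. [Note: you will need the
explicit forms of the $\lambda$ matrices (appendix M); you need to verify
the transformation law

$$Q' = (1 - i\boldsymbol{\eta}\cdot\boldsymbol{\lambda}^*/2)\,Q.]$$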
14.2 


(a) Verify that the normally ordered QCD interaction $\bar{q}_f \gamma^\mu \frac{1}{2}\lambda_a q_f A_{a\mu}$ is
C-invariant.


(b) Show that $\lambda_a F_a^{\mu\nu}$ transforms under C according to (14.36).


14.3 Verify that the Lorentz-invariant ‘contraction’ $\epsilon_{\mu\nu\rho\sigma}F^{\mu\nu}F^{\rho\sigma}$ of two U(1)
(Maxwell) field strength tensors is equal to $8\,\boldsymbol{E}\cdot\boldsymbol{B}$.


14.4 Verify that the cross section for the exchange of a single massless scalar
gluon between two quarks (or between a quark and an antiquark) contains no
‘$1/t^2$’ factor.


14.5 This problem is concerned with the evaluation of various ‘colour factors’. 


(a) Consider first the colour factor needed for equation (14.73). The
‘colour wavefunction’ part of the amplitude (14.59) is

$$\sum_c a_c(c_3)\,\chi^\dagger(c_1)\,\frac{\lambda_c}{2}\,\chi(c_2) \tag{14.83}$$

where $c_1$, $c_2$ and $c_3$ label the colour degree of freedom of the quark,
antiquark and gluon respectively, and the sum on the index $c$ has
been indicated explicitly. The $\chi$’s are the colour wavefunctions of
the quark and antiquark, and are represented by three-component
column vectors; a convenient choice is

$$\chi(r) = \begin{pmatrix} 1 \\ 0 \\ 0 \end{pmatrix}, \quad \chi(b) = \begin{pmatrix} 0 \\ 1 \\ 0 \end{pmatrix}, \quad \chi(g) = \begin{pmatrix} 0 \\ 0 \\ 1 \end{pmatrix} \tag{14.84}$$

by analogy with the spin wavefunctions of SU(2). The cross section
is obtained by forming the modulus squared of (14.83) and summing
over the colour labels $c_i$:

$$\sum_{c,d,c_1,c_2,c_3} \chi^\dagger_r(c_1)\left(\frac{\lambda_c}{2}\right)_{rs}\chi_s(c_2)\,a_c(c_3)\,a^*_d(c_3)\,\chi^\dagger_l(c_2)\left(\frac{\lambda_d}{2}\right)_{lm}\chi_m(c_1) \tag{14.85}$$

where summation is understood on the matrix indices on the $\chi$'s and
$\lambda$'s, which have been indicated explicitly. In this form the expression
is very similar to the spin summations considered in chapter 8
(cf equation (8.62)). We proceed to evaluate it as follows:

(i) Show that

$$\sum_{c_2} \chi_s(c_2)\,\chi^\dagger_l(c_2) = \delta_{sl}.$$

(ii) Assuming the analogous result

$$\sum_{c_3} a_c(c_3)\,a^*_d(c_3) = \delta_{cd},$$

show that (14.85) becomes

$$\sum_c \left(\frac{\lambda_c}{2}\,\frac{\lambda_c}{2}\right)_{rr}$$

where the (implied) sum on $r$ runs from 1 to 3.

(iii) The expression $\sum_c \frac{\lambda_c}{2}\frac{\lambda_c}{2}$ is just the Casimir operator $\hat{C}_2$ (see
section M.5 in appendix M) for SU(3) in the fundamental representation
$\mathbf{3}$, which from (M.67) has the value $C_F \mathbf{1}_3$, where $\mathbf{1}_3$ is the
unit $3 \times 3$ matrix, and $C_F = 4/3$. Hence show that the colour factor
for (14.73) is 4.

Note that if we averaged over the colours of the initial quark, or
considered one particular colour, the colour factor would be $C_F$.
(b) The colour part for the triple gluon vertex $g_1 \to g_2 + g_3$ is

$$\sum_{c,d,e} a^*_d(c_2)\,a^*_e(c_3)\,f_{dec}\,a_c(c_1).$$

Show that the modulus squared of this, averaged over the initial
gluon colours and summed over the final gluon colours, is

$$\frac{1}{8}\sum_{c,d,e} f_{dec}\,f_{dec},$$

where each of $c, d, e$ runs from 1 to 8. Deduce using (12.84) that
this expression can be written as

$$\frac{1}{8}\,\mathrm{Tr}\left(\sum_d G_d^{(8)} G_d^{(8)}\right)$$

where $G_d^{(8)}$ ($d = 1, \ldots, 8$) are the $8 \times 8$ matrices representing the generators
of SU(3) in the 8-dimensional (adjoint) representation (see
section 12.2). The expression $\sum_d G_d^{(8)} G_d^{(8)}$ is the SU(3) Casimir
operator $\hat{C}_2$ in the adjoint representation, which from (M.67) has
the value $C_A \mathbf{1}_8$, where $\mathbf{1}_8$ is the $8 \times 8$ unit matrix, and $C_A = 3$.
Hence show that the (averaged, summed) triple gluon vertex colour
factor is $C_A = 3$.


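(c) The colour part of the $g \to q + \bar{q}$ vertex is

$$\chi^\dagger_r(c_3)\left(\frac{\lambda_c}{2}\right)_{rs}\chi_s(c_2)\,a_c(c_1).$$

Show that the modulus squared of this, averaged over the initial
gluon colours and summed over the final quark colours is

$$\frac{1}{8}\sum_c \mathrm{Tr}\left(\frac{\lambda_c}{2}\,\frac{\lambda_c}{2}\right) = \frac{1}{2}.$$

This number is usually denoted by $T_R$.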
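
The three colour factors of this problem can be cross-checked numerically. The sketch below is only a sanity check and not a substitute for the derivation asked for; it assumes the standard Gell-Mann form of the $\lambda$ matrices and extracts the structure constants from $[T_a, T_b] = i f_{abc} T_c$ with $T_a = \lambda_a/2$. All helper names are illustrative.

```python
# Gell-Mann matrices (standard convention, assumed here) as nested lists.
i = 1j
s3 = 3 ** -0.5
LAM = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, -i, 0], [i, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, -i], [0, 0, 0], [i, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 0, -i], [0, i, 0]],
    [[s3, 0, 0], [0, s3, 0], [0, 0, -2 * s3]],
]
T = [[[x / 2 for x in row] for row in m] for m in LAM]  # generators lambda_a / 2

def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(3)) for c in range(3)] for r in range(3)]

def trace(a):
    return sum(a[k][k] for k in range(3))

def commutator(a, b):
    ab, ba = matmul(a, b), matmul(b, a)
    return [[ab[r][c] - ba[r][c] for c in range(3)] for r in range(3)]

def f(a, b, c):
    """Structure constant from Tr([T_a, T_b] T_c) = (i/2) f_abc."""
    return (-2j * trace(matmul(commutator(T[a], T[b]), T[c]))).real

squares = [matmul(t, t) for t in T]
casimir = [[sum(sq[r][c] for sq in squares) for c in range(3)] for r in range(3)]

C_F = casimir[0][0].real                       # part (a): sum_c T_c T_c = C_F * 1
colour_factor_1473 = trace(casimir).real       # part (a): trace over r gives 3 * C_F
C_A = sum(f(d, e, c) ** 2
          for c in range(8) for d in range(8) for e in range(8)) / 8  # part (b)
T_R = sum(trace(sq) for sq in squares).real / 8                       # part (c)
print(C_F, colour_factor_1473, C_A, T_R)
```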
14.6 Verify equation (14.60). 


14.7 Verify equation (14.72). 


14.8 Verify that expression (14.68) becomes the factor in large parentheses in
equation (14.73), when expressed in terms of the $x_i$'s.
QCD II: Asymptotic Freedom, the
Renormalization Group, and Scaling
Violations


In the previous chapter we learned that QCD amplitudes contributing to
$e^+e^- \to$ jets generally have IRC singularities, but that finite physical cross
sections can be obtained by including together kinematically indistinguishable
final states. The partial cross sections (for example $\sigma(e^+e^- \to 2\ \text{jets})$) will
depend on the IRC cut-off parameter(s). What about the fully inclusive process
$e^+e^- \to$ hadrons, where all final states are summed over? At order $\alpha_s$,
the parton-level diagrams contributing to this process are the same ones we
considered in section 14.5, namely figures 14.16, 14.21 and 14.22. If we denote
the amplitudes for these contributions by $F_{g,r}$ (for real gluon emission), $F_{g,v}$
(for virtual gluon emission) and $F_\gamma$ for the Born graph, then the partial cross
section $\sigma(e^+e^- \to 2\ \text{jets})$ includes $|F_\gamma|^2$, the interference term $2\mathrm{Re}(F_\gamma F^*_{g,v})$,
and the integral of $|F_{g,r}|^2$ over strips near the boundaries of figure 14.17. At
this order, the partial cross section $\sigma(e^+e^- \to 3\ \text{jets})$ is given by the integral
of $|F_{g,r}|^2$ over the remaining (interior) region of figure 14.17. The corresponding
total cross section is thus simply the sum of $|F_\gamma|^2$, $2\mathrm{Re}(F_\gamma F^*_{g,v})$, and the
integral of $|F_{g,r}|^2$ over the whole of the $x_1$-$x_2$ phase space. Clearly the IRC
singularities will cancel, as in the 2-jet cross section, and the result will not
depend on any IRC cut-off parameter. Indeed, the result is (see for example
Muta 2010, section 5.1.2)

$$\sigma(e^+e^- \to \text{hadrons}) = \sigma_{\rm pt}(Q^2)\,(1 + \alpha_s/\pi). \tag{15.1}$$


This fully inclusive cross section is finite and free of IRC cut-offs. 


At first sight, this result might appear satisfactory. It predicts a cross
section somewhat greater than $\sigma_{\rm pt}$, as is observed in figure 14.1 - from which
we might infer that $\alpha_s \sim 0.5$ or less. Assuming the expansion parameter is
$\alpha_s/\pi$, the implied perturbation series in powers of $\alpha_s$ would seem to be rapidly
convergent. However, this is an illusion, which is dispelled as soon as we go
to the next order in $\alpha_s$ (i.e. to the order $\alpha^2\alpha_s^2$ in the cross section).
FIGURE 15.1
Some higher-order processes contributing to $e^+e^- \to$ hadrons at the parton
level.

15.1 Higher-order QCD corrections to $\sigma(e^+e^- \to \text{hadrons})$:
large logarithms


Some typical graphs contributing to this order of the cross section are shown
in figure 15.1 (note that, as with the $O(\alpha^2\alpha_s)$ terms, some graphs will contribute
via their modulus squared and some via interference terms). The
result was obtained numerically by Dine and Saperstein (1979), and analytically
by Chetyrkin et al. (1979) and by Celmaster and Gonsalves (1980). For
our present purposes, the crucial feature of the answer is the appearance of a
term

$$-\sigma_{\rm pt}\,\beta_0\,\frac{\alpha_s^2}{\pi}\,\ln(Q^2/\mu^2) \tag{15.2}$$

where $\mu$ is a mass scale (about which we shall shortly have a lot more to say,
but which for the moment may be thought of as related in some way to an
average quark mass), and the coefficient $\beta_0$ is given by

$$\beta_0 = \frac{33 - 2N_f}{12\pi} \tag{15.3}$$

where $N_f$ is the number of ‘active’ flavours (e.g. $N_f = 5$ above the $b\bar{b}$ threshold).
The term (15.2) raises the following problem. The ratio between it and
the $O(\alpha^2\alpha_s)$ term is clearly

$$-\beta_0\,\alpha_s \ln(Q^2/\mu^2). \tag{15.4}$$

If we take $N_f = 5$, $\alpha_s \approx 0.4$, $\mu \sim 1$ GeV and $Q^2 \sim (10\ \text{GeV})^2$, (15.4) is of order
1, and can in no sense be regarded as a small perturbation. Furthermore, the
correction (15.4), by itself, would predict large scaling violations in this cross
section - that is, large $Q^2$-dependent departures from the point-like Born cross
section, $\sigma_{\rm pt}(Q^2)$. But the data actually follow the point-like prediction very
well.
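
As a quick arithmetic check of the statement that (15.4) is of order 1, the ratio can be evaluated for the quoted inputs. This is a minimal sketch; the function names are illustrative only.

```python
import math

def beta0(n_f):
    """One-loop coefficient (15.3): (33 - 2 N_f) / (12 pi)."""
    return (33 - 2 * n_f) / (12.0 * math.pi)

def log_ratio(alpha_s, Q, mu, n_f):
    """Ratio (15.4): -beta0 * alpha_s * ln(Q^2 / mu^2), with Q and mu in the same units."""
    return -beta0(n_f) * alpha_s * math.log(Q ** 2 / mu ** 2)

# Inputs quoted in the text: N_f = 5, alpha_s ~ 0.4, mu ~ 1 GeV, Q ~ 10 GeV.
r = log_ratio(alpha_s=0.4, Q=10.0, mu=1.0, n_f=5)
print(round(beta0(5), 4), round(r, 3))
```

The magnitude comes out slightly above 1, so the nominally higher-order term is as large as the term it is supposed to correct.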
Suppose that, nevertheless, we consider the sum of (15.1) and (15.2), which
is

$$\sigma_{\rm pt}\left[1 + \frac{\alpha_s}{\pi}\left\{1 - \beta_0\,\alpha_s \ln(Q^2/\mu^2)\right\}\right]. \tag{15.5}$$

This suggests that one effect, at least, of these higher-order corrections is to
convert $\alpha_s$ to a $Q^2$-dependent quantity, namely $\alpha_s\{1 - \beta_0\alpha_s \ln(Q^2/\mu^2)\}$. We
have seen something very like this before, in equation (11.56), for the case
of QED. There is, however, one remarkable difference: here the coefficient of
the ln is negative, whereas that in (11.56) is positive. Apart from this (vital!)
difference, however, we can reasonably begin to think in terms of an effective
‘$Q^2$-dependent strong coupling constant $\alpha_s(Q^2)$’.

Pressing on with the next order ($\alpha^2\alpha_s^3$) terms, we encounter a term (Samuel
and Surguladze 1991, Gorishnii et al. 1991)

$$\sigma_{\rm pt}\,\frac{\alpha_s}{\pi}\left[\alpha_s\beta_0 \ln(Q^2/\mu^2)\right]^2 \tag{15.6}$$

and the ratio between this and (15.2) is precisely (15.4) once again! We are
now strongly inclined to suspect that we are seeing, in this class of terms, an
expansion of the form $(1+x)^{-1} = 1 - x + x^2 - x^3 + \ldots$. If true, this would imply
that all terms of the form (15.2) and (15.6), and higher, sum up to give (cf
(11.63))

$$\sigma_{\rm pt}\left[1 + \frac{\alpha_s/\pi}{1 + \alpha_s\beta_0 \ln(Q^2/\mu^2)}\right]. \tag{15.7}$$

The ‘re-summation’ effected by (15.7) has a remarkable effect: the ‘dangerous’
large logarithms in (15.2) and (15.6) are now effectively in the denominator
(cf (11.56)), and their effect is such as to reduce the effective value of $\alpha_s$ as
$Q^2$ increases - exactly the property of asymptotic freedom.
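
The difference between truncating the series and re-summing it can be seen in a small numerical sketch, in which $x$ stands for $\alpha_s\beta_0 \ln(Q^2/\mu^2)$. The two sample values of $x$ are chosen purely for illustration.

```python
def partial_sum(x, n_terms):
    """First n_terms of the series 1 - x + x^2 - x^3 + ..."""
    return sum((-x) ** k for k in range(n_terms))

def resummed(x):
    """Closed form (1 + x)^(-1) that the series represents for |x| < 1."""
    return 1.0 / (1.0 + x)

# x = 0.5: a modest logarithm; x = 1.12: roughly the size of (15.4) for the inputs quoted above.
for x in (0.5, 1.12):
    sums = [round(partial_sum(x, n), 3) for n in (2, 4, 8, 16)]
    print(x, sums, round(resummed(x), 3))
```

For $x < 1$ the truncations settle down to the closed form; for $x > 1$ they oscillate with growing amplitude even though $(1+x)^{-1}$ remains perfectly finite. Whether the closed form is the right answer in that regime is exactly what the renormalization group argument has to establish.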

We hasten to say that of course this is not how the property was discovered
- which was, rather, through the calculations of Politzer (1973) and Gross
and Wilczek (1973). Prior to their work, it was widely believed that any
quantum field theory would have a running coupling which behaved like that
of QED which, as we saw in section 11.5.3, increases for large $Q^2$ (short
distances). Such behaviour would make the scaling violations due to a term
like (15.7) even worse. It was therefore a mystery how quantum field theory
could account for the small scaling violations seen in the data. The discovery
that the running couplings of non-Abelian gauge theories became weaker at
large $Q^2$ opened the way for a quantitative understanding of parton-model
scaling, and perturbative QCD corrections to it.

To place the asymptotic freedom calculation in its proper context requires
a considerable detour. Referring to our previous discussion, we may ask: are
we guaranteed that still-higher-order terms will indeed continue to contain
pieces corresponding to the expansion of (15.7)? And what exactly is the
mass parameter $\mu$? Answering these questions will lead to the important
body of ideas going under the name of the ‘renormalization group’.

FIGURE 15.2
One-loop vacuum polarization contribution to $Z_3$.

15.2 The renormalization group and related ideas in QED 
15.2.1 Where do the large logs come from? 


We have taken the title of this section from that of Section 18.1 in Weinberg 
(1996), which we have found very illuminating, and to which we refer for a 
more detailed discussion. 


As we have just mentioned, the phenomenon of ‘large logarithms’ arises
also in the simpler case of QED. There, however, the factor corresponding
to $\alpha_s\beta_0 \sim \frac{1}{4}$ is $\alpha/3\pi \sim 10^{-3}$, so that it is only at quite unrealistically enormous
$|q^2|$ values that the corresponding factor $(\alpha/3\pi)\ln(|q^2|/m_e^2)$ (where $m_e$
is the electron mass) becomes of order unity. Nevertheless, the origin of the
logarithmic term is essentially the same in both cases, and the technicalities
are much simpler for QED (no photon self-interactions, no ghosts). We shall
therefore forget about QCD for a while, and concentrate on QED. Indeed, the
discussion of renormalization of QED given in chapter 11 will be sufficient to
answer the question in the title of this subsection.

For the answer does, in fact, fundamentally have to do with renormalization.
Let us go back to the renormalization of the charge in QED. We learned
in chapter 11 that the renormalized charge $e$ was given in terms of the ‘bare’
charge $e_0$ by the relation $e = e_0 (Z_2/Z_1) Z_3^{1/2}$ (see (11.6)), where in fact due
to the Ward identity $Z_1$ and $Z_2$ are equal (section 11.6), so that only $Z_3^{1/2}$ is
needed. To order $e^2$ in renormalized perturbation theory, including only the
$e^+e^-$ loop of figure 15.2, $Z_3$ is given by (cf (11.31))

$$Z_3^{[2]} = 1 + \Pi_\gamma^{[2]}(0) \tag{15.8}$$
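where, from (11.23) and (11.24),

$$\Pi_\gamma^{[2]}(q^2) = 8e^2 i \int_0^1 dx \int \frac{d^4k'}{(2\pi)^4}\, \frac{x(1-x)}{(k'^2 - \Delta_\gamma + i\epsilon)^2} \tag{15.9}$$

and $\Delta_\gamma = m_e^2 - x(1-x)q^2$ with $q^2 < 0$. We regularize the $k'$ integral by a
cut-off $\Lambda$ as explained in sections 10.3.1 and 10.3.2, obtaining (problem 15.1)

$$\Pi_\gamma^{[2]}(q^2) = -\frac{e^2}{\pi^2}\int_0^1 dx\, x(1-x)\left\{\ln\left(\frac{\Lambda + (\Lambda^2 + \Delta_\gamma)^{1/2}}{\Delta_\gamma^{1/2}}\right) - \frac{\Lambda}{(\Lambda^2 + \Delta_\gamma)^{1/2}}\right\}. \tag{15.10}$$

Setting $q^2 = 0$ and retaining the dominant $\ln\Lambda$ term, we find that

$$(Z_3^{[2]})^{1/2} = 1 - \left(\frac{\alpha}{3\pi}\right)\ln(\Lambda/m_e). \tag{15.11}$$

It is not a coincidence that the coefficient $\alpha/3\pi$ of the ultraviolet divergence
is also the coefficient of the $\ln(|q^2|/m_e^2)$ term in (11.55)-(11.57); we need to
understand why.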

We first recall how (11.55) was arrived at. It refers to the renormalized
self-energy part, which is defined by the ‘subtracted’ form

$$\bar{\Pi}_\gamma^{[2]}(q^2) = \Pi_\gamma^{[2]}(q^2) - \Pi_\gamma^{[2]}(0). \tag{15.12}$$

In the process of subtraction, the dependence on the cut-off $\Lambda$ disappears and
we are left with

$$\bar{\Pi}_\gamma^{[2]}(q^2) = -\frac{2\alpha}{\pi}\int_0^1 dx\, x(1-x)\ln\left[\frac{m_e^2}{m_e^2 - q^2 x(1-x)}\right] \tag{15.13}$$

as in equation (11.34). For large values of $|q^2|$ this leads to the ‘large log’
term $(\alpha/3\pi)\ln(|q^2|/m_e^2)$. Now, in order to form such a term, it is obviously
not possible to have just ‘$\ln|q^2|$’ appearing: the argument of the logarithm
must be dimensionless, so that some mass scale must be present, to which
$|q^2|$ can be compared. In the present case, that mass scale is evidently $m_e$,
which entered via the quantity $\Pi_\gamma^{[2]}(0)$, or equivalently via the renormalization
constant $Z_3^{[2]}$ (cf (15.11)). This is the beginning of the answer to our questions.
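
The emergence of the large logarithm from the subtracted self-energy can be checked by direct numerical integration. The sketch below assumes the form $\bar{\Pi}_\gamma^{[2]}(q^2) = -(2\alpha/\pi)\int_0^1 dx\, x(1-x)\ln[m_e^2/(m_e^2 - q^2x(1-x))]$ for (15.13), uses a simple midpoint rule, and compares with the large-$|q^2|$ behaviour $(\alpha/3\pi)[\ln(|q^2|/m_e^2) - 5/3]$, whose constant follows from $\int_0^1 dx\, x(1-x)\ln[x(1-x)] = -5/18$. The value of $\alpha$ and the choice $|q^2|/m_e^2 = 10^8$ are illustrative.

```python
import math

ALPHA = 1.0 / 137.036  # fine structure constant (approximate value)

def pi_bar(q2_over_m2, n=20000):
    """Subtracted one-loop self-energy as a function of q^2/m_e^2 (negative when spacelike).

    Midpoint-rule evaluation of -(2 alpha/pi) * int_0^1 dx x(1-x) ln[1 / (1 - (q^2/m^2) x(1-x))].
    """
    h = 1.0 / n
    total = 0.0
    for k in range(n):
        x = (k + 0.5) * h
        w = x * (1.0 - x)
        total += w * math.log(1.0 / (1.0 - q2_over_m2 * w))
    return -(2.0 * ALPHA / math.pi) * total * h

ratio = 1.0e8                                   # |q^2| / m_e^2
numeric = pi_bar(-ratio)
leading_log = (ALPHA / (3.0 * math.pi)) * math.log(ratio)
with_constant = (ALPHA / (3.0 * math.pi)) * (math.log(ratio) - 5.0 / 3.0)
print(numeric, leading_log, with_constant)
```

The numerical integral agrees with the logarithm plus constant, and the logarithm alone already accounts for about 90 per cent of it at this value of $|q^2|/m_e^2$.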

Why is it $m_e$ that enters into $\Pi_\gamma^{[2]}(0)$ or $Z_3$? Part of the answer - once
again - is of course that a ‘$\ln\Lambda$’ cannot appear in that form, but must be
‘$\ln(\Lambda/\text{some mass})$’. So we must enquire: what determines the ‘some mass’?
With this question we have reached the heart of the problem (for the moment).
The answer is, in fact, not immediately obvious: it lies in the prescription used
to define the renormalized coupling constant; this prescription, whatever it is,
determines $Z_3$.

The value (15.8) (or (11.31)) was determined from the requirement that
the $O(e^2)$ corrected photon propagator (in $\xi = 1$ gauge) had the simple form
$-ig_{\mu\nu}/q^2$ as $q^2 \to 0$; that is, as the photon goes on-shell. Now, this is a
perfectly ‘natural’ definition of the renormalized charge - but it is by no means
forced upon us. In fact the appearance of a singularity in $Z_3^{[2]}$ as $m_e \to 0$
suggests that it is inappropriate to the case in which fermion masses are
neglected. We could in principle choose a different value of $q^2$, say $q^2 = -\mu^2$,
at which to ‘subtract’. Certainly the difference between $\Pi_\gamma^{[2]}(q^2 = 0)$ and
$\Pi_\gamma^{[2]}(q^2 = -\mu^2)$ is finite as $\Lambda \to \infty$, so such a redefinition of ‘the’ renormalized
charge would only amount to a finite shift. Nevertheless, even a finite shift
is alarming, to those accustomed to a certain ‘sanctity’ in the value $\alpha = \frac{1}{137}$!
We have to concede, however, that if the point of renormalization is to render
amplitudes finite by taking certain constants from experiment, then any choice
of such constants should be as good as any other - for example, the ‘charge’
defined at $q^2 = -\mu^2$ rather than at $q^2 = 0$.

Thus there is, actually, a considerable arbitrariness in the way renormalization
can be done - a fact to which we did not draw attention in our earlier
discussions in chapters 10 and 11. Nevertheless, it must somehow be the case
that, despite this arbitrariness, physical results remain the same. We shall 
come back to this important point shortly. 


15.2.2 Changing the renormalization scale 
The recognition that the renormalization scale ($-\mu^2$ in this case) is arbitrary
suggests a way in which we might exploit the situation, so as to avoid large
‘$\ln(|q^2|/m_e^2)$’ terms: we renormalize at a large value of $\mu^2$! Consider what
happens if we define a new $Z_3^{[2]}$ by

$$Z_3^{[2]}(\mu) = 1 + \Pi_\gamma^{[2]}(q^2 = -\mu^2). \tag{15.14}$$

Then for $\mu^2 \gg m_e^2$, but $\mu^2 \ll \Lambda^2$, we have

$$(Z_3^{[2]}(\mu))^{1/2} = 1 - \left(\frac{\alpha}{3\pi}\right)\ln(\Lambda/\mu), \tag{15.15}$$

and a new renormalized self-energy

$$\bar{\Pi}_\gamma^{[2]}(q^2, \mu^2) = \Pi_\gamma^{[2]}(q^2) - \Pi_\gamma^{[2]}(q^2 = -\mu^2) = -\frac{2\alpha}{\pi}\int_0^1 dx\, x(1-x)\ln\left[\frac{m_e^2 + \mu^2 x(1-x)}{m_e^2 - q^2 x(1-x)}\right]. \tag{15.16}$$

For $\mu^2$ and $-q^2$ both $\gg m_e^2$, the logarithm is now $\ln(|q^2|/\mu^2)$ which is small
when $|q^2|$ is of order $\mu^2$. It seems, therefore, that with this different renormalization
prescription we have ‘tamed’ the large logarithms.

However, we have forgotten that, for consistency, the ‘$e$’ we should now be
using is the one defined, in terms of $e_0$, via

$$e_\mu = (Z_3^{[2]}(\mu))^{1/2}\, e_0 = \left(1 - \frac{\alpha}{3\pi}\ln(\Lambda/\mu)\right) e_0 \tag{15.17}$$

rather than

$$e = (Z_3^{[2]})^{1/2}\, e_0 = \left(1 - \frac{\alpha}{3\pi}\ln(\Lambda/m_e)\right) e_0, \tag{15.18}$$

working always to one-loop order with an $e^+e^-$ loop. The relation between
$e_\mu$ and $e$ is then

$$e_\mu = \frac{1 - \frac{\alpha}{3\pi}\ln(\Lambda/\mu)}{1 - \frac{\alpha}{3\pi}\ln(\Lambda/m_e)}\; e \approx \left(1 + \frac{\alpha}{3\pi}\ln(\mu/m_e)\right) e \tag{15.19}$$

to leading order in $\alpha$. Equation (15.19) indeed represents, as anticipated,
a finite shift from ‘$e$’ to ‘$e_\mu$’, but the problem with it is that a ‘large log’
has resurfaced in the form of $\ln(\mu/m_e)$ (remember that our idea was to take
$\mu^2 \gg m_e^2$). Although the numerical coefficient of the log in (15.19) is certainly
small, a similar procedure applied to QCD will involve the larger coefficient
$\beta_0\alpha_s$ as in (15.5), and the correction analogous to (15.19) will be of order 1,
invalidating the approach.

We have to be more subtle. Instead of making one jump from $m_e^2$ to a
large value $\mu^2$, we need to proceed in stages. We can calculate $e_\mu$ from $e$ as
long as $\mu$ is not too different from $m_e$. Then we can proceed to $e_{\mu'}$ for $\mu'$
not too different from $\mu$, and so on. Rather than thinking of such a process
in discrete stages $m_e \to \mu \to \mu' \to \ldots$, it is more convenient to consider
infinitesimal steps - that is, we regard $e_{\mu'}$ at the scale $\mu'$ as being a continuous
function of $e_\mu$ at scale $\mu$, and of whatever other dimensionless variables exist
in the problem (since the $e$’s are themselves dimensionless). In the present
case, these other variables are $\mu'/\mu$ and $m_e/\mu$, so that $e_{\mu'}$ must have the
form

$$e_{\mu'} = E(e_\mu, \mu'/\mu, m_e/\mu). \tag{15.20}$$

Differentiating (15.20) with respect to $\mu'$ and letting $\mu' = \mu$ we obtain

$$\mu\frac{de_\mu}{d\mu} = \beta_{\rm em}(e_\mu, m_e/\mu) \tag{15.21}$$

where

$$\beta_{\rm em}(e_\mu, m_e/\mu) = \left[\frac{\partial}{\partial z}E(e_\mu, z, m_e/\mu)\right]_{z=1}. \tag{15.22}$$

For $\mu \gg m_e$ equation (15.21) reduces to

$$\mu\frac{de_\mu}{d\mu} = \beta_{\rm em}(e_\mu, 0) \equiv \beta_{\rm em}(e_\mu), \tag{15.23}$$

which is a form of Callan-Symanzik equation (Callan 1970, Symanzik 1970);
it governs the change of the coupling constant $e_\mu$ as the renormalization scale
$\mu$ changes.
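To this one-loop order, it is easy to calculate the crucial quantity $\beta_{\rm em}(e_\mu)$.
Returning to (15.17), we may write the bare coupling $e_0$ as

$$e_0 = e_\mu\left(1 - \frac{\alpha}{3\pi}\ln(\Lambda/\mu)\right)^{-1} \approx e_\mu\left(1 + \frac{\alpha}{3\pi}\ln(\Lambda/\mu)\right) \approx e_\mu\left(1 + \frac{\alpha_\mu}{3\pi}\ln(\Lambda/\mu)\right) \tag{15.24}$$

where the last step follows from the fact that $e$ and $e_\mu$ differ by $O(e^3)$, which
would be a higher-order correction to (15.24). Now the unrenormalized coupling
is certainly independent of $\mu$. Hence, differentiating (15.24) with respect
to $\mu$ at fixed $e_0$, we find

$$\left.\frac{de_\mu}{d\mu}\right|_{e_0} - \frac{\alpha_\mu}{3\pi\mu}\,e_\mu + \frac{\alpha_\mu}{\pi}\ln(\Lambda/\mu)\left.\frac{de_\mu}{d\mu}\right|_{e_0} = 0. \tag{15.25}$$

Working to order $e_\mu^3$ we can drop the last term in (15.25), obtaining finally
(to one-loop order)

$$\mu\left.\frac{de_\mu}{d\mu}\right|_{e_0} = \frac{e_\mu^3}{12\pi^2} \quad (\equiv \beta_{\rm em}^{[2]}(e_\mu)). \tag{15.26}$$
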

We can now integrate equation (15.26) to obtain $e_\mu$ at an arbitrary scale $\mu$,
in terms of its value at some scale $\mu = M$, chosen in practice large enough so
that for variable scales $\mu$ greater than $M$ we can neglect $m_e$ compared with $\mu$,
but small enough so that $\ln(M/m_e)$ terms do not invalidate the perturbation
theory calculation of $e_M$ from $e$. The solution of (15.26) is then (problem
15.2)

$$\ln(\mu/M) = 6\pi^2\left(\frac{1}{e_M^2} - \frac{1}{e_\mu^2}\right) \tag{15.27}$$

or equivalently

$$e_\mu^2 = \frac{e_M^2}{1 - \frac{e_M^2}{12\pi^2}\ln(\mu^2/M^2)}, \tag{15.28}$$

which is

$$\alpha_\mu = \frac{\alpha_M}{1 - \frac{\alpha_M}{3\pi}\ln(\mu^2/M^2)} \tag{15.29}$$

where $\alpha = e^2/4\pi$. The crucial point is that the ‘large log’ is now in the
denominator (and has coefficient $\alpha_M/3\pi$!). We note that the general solution
of (15.23) may be written as

$$\ln(\mu/M) = \int_{e_M}^{e_\mu}\frac{de}{\beta_{\rm em}(e)}. \tag{15.30}$$

We have made progress in understanding how the coupling changes as
the renormalization scale changes, and how a ‘large logarithmic’ change as in
(15.19) can be brought under control via (15.29). The final piece in the puzzle
is to understand how this can help us with the large $-q^2$ behaviour of our
cross section, the problem we originally started from.
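
The idea of proceeding ‘in stages’ can be made concrete by stepping the one-loop equation $\mu\, de_\mu/d\mu = e_\mu^3/12\pi^2$ forward in many small increments of $\ln\mu$ and comparing with the closed form $e_\mu^2 = e_M^2/[1 - (e_M^2/12\pi^2)\ln(\mu^2/M^2)]$. This is a minimal sketch using simple Euler steps; the starting value and the range $\ln(\mu/M) = 10$ are arbitrary illustrative choices.

```python
import math

def step_coupling(e_M, log_mu_over_M, n_steps=100000):
    """Euler-integrate mu de/dmu = e^3/(12 pi^2) in n_steps small steps of ln(mu)."""
    e = e_M
    h = log_mu_over_M / n_steps
    for _ in range(n_steps):
        e += h * e ** 3 / (12.0 * math.pi ** 2)
    return e

def closed_form(e_M, log_mu_over_M):
    """Closed-form one-loop solution; note ln(mu^2/M^2) = 2 ln(mu/M)."""
    e2 = e_M ** 2 / (1.0 - e_M ** 2 * 2.0 * log_mu_over_M / (12.0 * math.pi ** 2))
    return math.sqrt(e2)

def one_jump(e_M, log_mu_over_M):
    """Single-step estimate in the spirit of (15.19): e_M (1 + (alpha_M/3 pi) ln(mu/M))."""
    alpha_M = e_M ** 2 / (4.0 * math.pi)
    return e_M * (1.0 + alpha_M * log_mu_over_M / (3.0 * math.pi))

e_M = math.sqrt(4.0 * math.pi / 137.0)  # illustrative starting coupling
L = 10.0                                # ln(mu/M), chosen arbitrarily
e_num = step_coupling(e_M, L)
e_exact = closed_form(e_M, L)
e_jump = one_jump(e_M, L)
print(e_num, e_exact, e_jump)
```

For the small QED coupling the single jump is numerically close to the integrated result, as the text anticipates; the point of the staged procedure is that it stays under control when the coefficient of the logarithm is not small.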


15.2.3 The RGE and large $-q^2$ behaviour in QED


To see the connection we need to implement the fundamental requirement,
stated at the end of section 15.2.2, that predictions for physically measurable
quantities must not depend on the renormalization scale $\mu$. Consider, for
example, our annihilation cross section $\sigma$ for $e^+e^- \to$ hadrons, pretending
that the one-loop corrections we are interested in are those due to QED rather
than QCD. We need to work in the spacelike region, so as to be consistent
with all the foregoing discussion. To make this clear, we shall now denote
the 4-momentum of the virtual photon by $q$ rather than $Q$, and take $q^2 < 0$
as in sections 15.2.1 and 15.2.2. Bearing in mind the way we used the
‘dimensionless-ness’ of the $e$’s in (15.20), let us focus on the dimensionless
ratio $\sigma/\sigma_{\rm pt} \equiv S$. Neglecting all masses, $S$ can only be a function of the
dimensionless ratio $|q^2|/\mu^2$ and of $e_\mu$:

$$S = S(|q^2|/\mu^2, e_\mu). \tag{15.31}$$

But $S$ must ultimately have no $\mu$ dependence. It follows that the $\mu^2$ dependence
arising via the $|q^2|/\mu^2$ argument must cancel that associated with $e_\mu$.
This is why the $\mu^2$-dependence of $e_\mu$ controls the $|q^2|$ dependence of $S$, and
hence of $\sigma$. In symbols, this condition is represented by the equation

$$\left(\mu\left.\frac{\partial}{\partial\mu}\right|_{e_\mu} + \mu\left.\frac{de_\mu}{d\mu}\right|_{e_0}\frac{\partial}{\partial e_\mu}\right) S(|q^2|/\mu^2, e_\mu) = 0, \tag{15.32}$$

or

$$\left(\mu\left.\frac{\partial}{\partial\mu}\right|_{e_\mu} + \beta_{\rm em}(e_\mu)\frac{\partial}{\partial e_\mu}\right) S(|q^2|/\mu^2, e_\mu) = 0. \tag{15.33}$$


Equation (15.33) is referred to as ‘the renormalization group equation
(RGE) for $S$’. The terminology goes back to Stueckelberg and Peterman
(1953), who were the first to discuss the freedom associated with the choice
of renormalization scale. The ‘group’ connotation is a trifle obscure - but all
it really amounts to is the idea that if we do one infinitesimal shift in $\mu^2$, and
then another, the result will be a third such shift; in other words, it is a kind of
‘translation group’. It was, however, Gell-Mann and Low (1954) who realized
how equation (15.33) could be used to calculate the large $|q^2|$ behaviour of $S$,
as we now explain.

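It is convenient to work in terms of $\mu^2$ and $\alpha$ rather than $\mu$ and $e$. Equation
(15.33) is then

$$\left(\mu^2\left.\frac{\partial}{\partial\mu^2}\right|_{\alpha_\mu} + \beta_{\rm em}(\alpha_\mu)\frac{\partial}{\partial\alpha_\mu}\right) S(|q^2|/\mu^2, \alpha_\mu) = 0, \tag{15.34}$$

where $\beta_{\rm em}(\alpha_\mu)$ is defined by

$$\beta_{\rm em}(\alpha_\mu) \equiv \mu^2\left.\frac{\partial\alpha_\mu}{\partial\mu^2}\right|_{e_0}. \tag{15.35}$$

From (15.35) and (15.26) we deduce that, to the one-loop order to which we
are working,

$$\beta_{\rm em}^{[2]}(\alpha_\mu) = \frac{e_\mu}{4\pi}\,\beta_{\rm em}^{[2]}(e_\mu) = \frac{\alpha_\mu^2}{3\pi}. \tag{15.36}$$

Now introduce the important variable

$$t = \ln(|q^2|/\mu^2). \tag{15.37}$$

Equation (15.34) then becomes

$$\left(-\frac{\partial}{\partial t} + \beta_{\rm em}(\alpha_\mu)\frac{\partial}{\partial\alpha_\mu}\right) S(e^t, \alpha_\mu) = 0. \tag{15.38}$$

This is a first-order differential equation which can be solved by implicitly
defining a new function - the running coupling $\alpha(|q^2|)$ - as follows (compare
(15.30)):

$$t = \int_{\alpha_\mu}^{\alpha(|q^2|)}\frac{d\alpha}{\beta_{\rm em}(\alpha)}. \tag{15.39}$$

To see how this helps, we have to recall how to differentiate an integral with
respect to one of its limits - or, more generally, the formula

$$\frac{\partial}{\partial a}\int^{f(a)} g(x)\,dx = g(f(a))\,\frac{\partial f}{\partial a}. \tag{15.40}$$

First, let us differentiate (15.39) with respect to $t$ at fixed $\alpha_\mu$; we obtain

$$1 = \frac{1}{\beta_{\rm em}(\alpha(|q^2|))}\,\frac{\partial\alpha(|q^2|)}{\partial t}. \tag{15.41}$$

Next, differentiate (15.39) with respect to $\alpha_\mu$ at fixed $t$ (note that $\alpha(|q^2|)$ will
depend on $\mu$ and hence on $\alpha_\mu$); we obtain

$$0 = \frac{1}{\beta_{\rm em}(\alpha(|q^2|))}\,\frac{\partial\alpha(|q^2|)}{\partial\alpha_\mu} - \frac{1}{\beta_{\rm em}(\alpha_\mu)}, \tag{15.42}$$

the minus sign coming from the fact that $\alpha_\mu$ is the lower limit in (15.39).
From (15.41) and (15.42) we find

$$\left[-\frac{\partial}{\partial t} + \beta_{\rm em}(\alpha_\mu)\frac{\partial}{\partial\alpha_\mu}\right]\alpha(|q^2|) = 0. \tag{15.43}$$

It follows that $S(1, \alpha(|q^2|))$ is a solution of (15.38).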
This is a remarkable result. It shows that all the dependence of $S$ on
the (momentum)$^2$ variable $|q^2|$ enters through that of the running coupling
$\alpha(|q^2|)$. Of course, this result is only valid in a regime of $-q^2$ which is much
greater than all quantities with dimension (mass)$^2$ - for example the squares of
all particle masses, which do not appear in (15.31). This is why the technique
applies only at ‘high’ $-q^2$. The result implies that if we can calculate $S(1, \alpha_\mu)$
(i.e. $S$ at the point $q^2 = -\mu^2$) at some definite order in perturbation theory,
then replacing $\alpha_\mu$ by $\alpha(|q^2|)$ will allow us to predict the $q^2$-dependence (at
large $-q^2$). All we need to do is solve (15.39). Indeed, for QED with one
$e^+e^-$ loop we have seen that $\beta_{\rm em}^{[2]}(\alpha) = \alpha^2/3\pi$. Hence integrating (15.39) we
obtain

$$\alpha(|q^2|) = \frac{\alpha_\mu}{1 - \frac{\alpha_\mu}{3\pi}\,t} = \frac{\alpha_\mu}{1 - \frac{\alpha_\mu}{3\pi}\ln(|q^2|/\mu^2)}. \tag{15.44}$$

This is almost exactly the formula we proposed in (11.57), on plausibility
grounds.¹
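
The claim that $S(1, \alpha(|q^2|))$ solves the RGE can be tested numerically with finite differences. In the sketch below the one-loop running coupling $\alpha(|q^2|) = \alpha_\mu/[1 - (\alpha_\mu/3\pi)t]$ is inserted into an arbitrary smooth function standing in for $S(1, \alpha)$, and the operator $-\partial/\partial t + \beta_{\rm em}(\alpha_\mu)\,\partial/\partial\alpha_\mu$ is applied by central differences. The stand-in function `S_of` and the sample point are hypothetical choices made only for the check.

```python
import math

def beta_em(alpha):
    """One-loop QED beta function in terms of alpha: alpha^2 / (3 pi)."""
    return alpha ** 2 / (3.0 * math.pi)

def alpha_run(t, alpha_mu):
    """One-loop running coupling with t = ln(|q^2|/mu^2)."""
    return alpha_mu / (1.0 - alpha_mu * t / (3.0 * math.pi))

def S_of(alpha):
    """Hypothetical smooth function standing in for S(1, alpha)."""
    return 1.0 + alpha / math.pi + 2.0 * alpha ** 2

def rge_residual(t, alpha_mu, h=1e-6):
    """Central-difference estimate of (-d/dt + beta d/d alpha_mu) S(1, alpha(|q^2|)).

    Returns the residual together with dS/dt, to show the scale of the terms that cancel.
    """
    dS_dt = (S_of(alpha_run(t + h, alpha_mu)) - S_of(alpha_run(t - h, alpha_mu))) / (2 * h)
    dS_da = (S_of(alpha_run(t, alpha_mu + h)) - S_of(alpha_run(t, alpha_mu - h))) / (2 * h)
    return -dS_dt + beta_em(alpha_mu) * dS_da, dS_dt

# Deliberately large coupling so that the individual terms are well above rounding noise.
residual, size = rge_residual(t=5.0, alpha_mu=0.1)
print(residual, size)
```

The two terms are individually of order $10^{-3}$ at this sample point but cancel to within the accuracy of the finite differences, whatever smooth function is used for `S_of`.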

Suppose now that the leading QED perturbative contribution to $S(1, \alpha_\mu)$
is $S_1\alpha_\mu$. Then the terms contained in $S(1, \alpha(|q^2|))$ in this approximation can
be found by expanding in powers of $\alpha_\mu$:

$$S_1\alpha(|q^2|) = S_1\alpha_\mu\left[1 - \frac{\alpha_\mu t}{3\pi}\right]^{-1} = S_1\alpha_\mu\left[1 + \frac{\alpha_\mu t}{3\pi} + \left(\frac{\alpha_\mu t}{3\pi}\right)^2 + \ldots\right], \tag{15.45}$$

where $t = \ln(|q^2|/\mu^2)$. The next-higher-order calculation of $S(1, \alpha_\mu)$ would be
$S_2\alpha_\mu^2$, say, which generates the terms

$$S_2\alpha^2(|q^2|) = S_2\alpha_\mu^2\left[1 + \frac{2\alpha_\mu t}{3\pi} + \ldots\right]. \tag{15.46}$$

Comparing (15.45) and (15.46) we see that each power of the large log factor
appearing in (15.46) comes with one more power of $\alpha_\mu$ than in (15.45). Provided
$\alpha_\mu$ is small, then, the leading terms in $t, t^2, \ldots$ are contained in (15.45).
It is in this sense that replacing $S(1, \alpha_\mu)$ by $S(1, \alpha(|q^2|))$ sums all ‘leading log
terms’.

In fact, of course, the one-loop (and higher) corrections to $S$ in which we
are really interested are those due to QCD, rather than QED, corrections. But
the logic is exactly the same. The leading ($O(\alpha_s)$) perturbative contribution
to $S = \sigma/\sigma_{\rm pt}$ at $q^2 = -\mu^2$ is given in (15.1) as $\alpha_s(\mu^2)/\pi$. It follows that
the ‘leading log corrections’ at high $-q^2$ are summed up by replacing this
expression by $\alpha_s(|q^2|)/\pi$, where the running $\alpha_s(|q^2|)$ is determined by solving
(15.39) with the QCD analogue of (15.36) - to which we now turn.


¹The difference has to do, of course, with the different renormalization prescriptions. Eq
(11.57) is written in terms of an ‘$\alpha$’ defined at $q^2 = 0$, and without neglect of $m_e$.


15.3 Back to QCD: asymptotic freedom 
15.3.1 One loop calculation 


The reader will of course have realized, some time back, that the quantity $\beta_0$
introduced in (15.3) must be precisely the coefficient of $\alpha_s^2$ in the one-loop
contribution to the $\beta$-function of QCD defined by

$$\beta_s = \mu^2\left.\frac{\partial\alpha_s}{\partial\mu^2}\right|_{\text{fixed bare }\alpha_s}, \tag{15.47}$$

that is to say,

$$\beta_s(\text{one loop}) = -\beta_0\alpha_s^2 \tag{15.48}$$

with

$$\beta_0 = \frac{33 - 2N_f}{12\pi}. \tag{15.49}$$

For $N_f \le 16$ the quantity $\beta_0$ is positive, so that the sign of (15.48) is opposite
to that of the QED analogue, equation (15.36). Correspondingly, (15.44) is
replaced by

$$\alpha_s(|q^2|) = \frac{\alpha_s(\mu^2)}{1 + \alpha_s(\mu^2)\,\beta_0\ln(Q^2/\mu^2)} \tag{15.50}$$

where $Q^2 = |q^2|$.² Then replacing $\alpha_s$ in (15.1) by (15.50) leads to (15.7).
Thus in QCD the strong coupling runs in the opposite way to QED, becoming
smaller at large values of $Q^2$ (or small distances) - the property of
asymptotic freedom. The justly famous result (15.49) was first obtained by
Politzer (1973), Gross and Wilczek (1973), and ’t Hooft. ’t Hooft’s result,
announced at a conference in Marseilles in 1972, was not published. The
published calculation of Politzer and of Gross and Wilczek quickly attracted
enormous interest, because it immediately offered a way to understand how
the successful parton model could be reconciled with the undoubtedly very
strong binding forces between quarks. The resolution, we now understand,
lies in quite subtle properties of renormalized quantum field theory, involving
first the exposure of ‘large logarithms’, then their re-summation in terms of
the running coupling, and of course the crucial sign of the $\beta$-function. Not
only did the result (15.49) explain the success of the parton model: it also,
we repeat, opened the prospect of performing reliable perturbative calculations
in a strongly interacting theory, at least at high $Q^2$. For example, at
sufficiently high $Q^2$, we can reliably compute the $\beta$ function in perturbation
theory. The result of Politzer and of Gross and Wilczek, when combined with
the motivations for a colour SU(3) group discussed in the previous chapter, led
rapidly to the general acceptance of QCD as the theory of strong interactions,
a conclusion reinforced by the demonstration by Coleman and Gross (1973)
that no theory without Yang-Mills fields possessed the property of asymptotic
freedom.


²Except that in (15.50) $\alpha_s$ is evaluated at large spacelike values of $q^2$, whereas in (15.7)
it is wanted at large timelike values. Readers troubled by this may consult Peskin and
Schroeder (1995) section 18.5. The difficulty is evaded in the approach of section 15.6
below.


FIGURE 15.3
$q\bar{q}$ vacuum polarization correction to the gluon propagator.

In section 11.5.3 we gave the conventional physical interpretation of the
way in which the running of the QED coupling tends to increase its value at
distances short enough to probe inside the screening provided by $e^+e^-$ pairs
($|q|^{-1} \ll m_e^{-1}$). This vacuum polarization screening effect is also present in
(15.49) via the term $-2N_f$, the value of which can be quite easily understood.
It arises from the ‘$q\bar{q}$’ vacuum polarization diagram of figure 15.3, which is
precisely analogous to the $e^+e^-$ diagram used to calculate $\bar{\Pi}_\gamma^{[2]}(q^2)$ in QED.
The only new feature in figure 15.3 is the presence of the $\lambda$-matrices at each
vertex. If ‘$a$’ and ‘$b$’ are the colour labels of the ingoing and outgoing gluons,
the $\lambda$-matrix factors must be

$$\sum_{\alpha,\beta}\left(\frac{\lambda_a}{2}\right)_{\alpha\beta}\left(\frac{\lambda_b}{2}\right)_{\beta\alpha} \tag{15.51}$$

since there are no free quark indices (of type $\alpha$, $\beta$) on the external legs of the
diagram. It is simple to check that (15.51) has the value $\frac{1}{2}\delta_{ab}$ (this is, in fact,
the way the $\lambda$'s are conventionally normalized). Hence for one quark flavour
we expect ‘$\alpha/3\pi$’ to be replaced by ‘$\alpha_s/6\pi$’, in agreement with the second
term in (15.49).

The all-important, positive, first term must therefore be due to the gluons.
The one-loop graphs contributing to the calculation of $\beta_0$ are shown in figure
15.4. They include figure 15.3, of course, but there are also, characteristically,
graphs involving the gluon self-coupling which is absent in QED, and also (in
covariant gauges) ghost loops. We do not want to enter into the details of the
calculation of $\beta(\alpha_s)$ here (they are given in Peskin and Schroeder 1995, chapter
16, for example), but it would be nice to have a simple intuitive picture of the
‘antiscreening’ result in terms of the gluon interactions, say. Unfortunately no
fully satisfactory simple explanation exists, though the reader may be interested
to consult Hughes (1980, 1981) and Nielsen (1981) for a ‘paramagnetic’
type of explanation, rather than a ‘dielectric’ one.

FIGURE 15.4
Graphs contributing to the one-loop $\beta$ function in QCD. The curly line represents
a gluon, a dotted line a ghost (see section 13.3.3) and a straight line
a quark.

Returning to (15.50), we note that the equation effectively provides a prediction of $\alpha_s$ at any scale $Q^2$, given its value at a particular scale $Q^2 = \mu^2$, which must be taken from experiment. The reference scale is now normally taken to be the $Z^0$ mass; the value $\alpha_s(m_Z^2)$ then plays the role in QCD that $\alpha \approx 1/137$ does in QED.

Despite appearances, equation (15.50) does not really involve two parameters; after all, (15.47) is only a first-order differential equation. By introducing

$$\ln\Lambda^2_{\rm QCD} = \ln\mu^2 - 1/(\beta_0\alpha_s(\mu^2)), \tag{15.52}$$

equation (15.50) can be rewritten (problem 15.3) as

$$\alpha_s(Q^2) = \frac{1}{\beta_0\ln(Q^2/\Lambda^2_{\rm QCD})}. \tag{15.53}$$

Equation (15.53) is equivalent to (cf (15.30))

$$\ln(Q^2/\Lambda^2_{\rm QCD}) = -\int_{\alpha_s(Q^2)}^{\infty}\frac{d\alpha_s}{\beta_s(\text{one loop})} \tag{15.54}$$

with $\beta_s(\text{one loop}) = -\beta_0\alpha_s^2$. $\Lambda_{\rm QCD}$ is therefore an integration constant, representing the scale at which $\alpha_s$ would diverge to infinity (if we extended our calculation beyond its perturbative domain of validity). More usefully, $\Lambda_{\rm QCD}$ is a measure of the scale at which $\alpha_s$ really does become ‘strong’. The extraction of a value of $\Lambda_{\rm QCD}$ is a somewhat complicated matter, as we shall briefly indicate in the following section, but a typical value is in the region of 200 MeV. Note that this is a distance scale of order $(200\ {\rm MeV})^{-1} \sim 1$ fm, just about the size of a hadron: a satisfactory connection.
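The equivalence of the two one-loop forms can be checked numerically. The sketch below is only illustrative: it assumes the one-loop running formula $\alpha_s(Q^2) = \alpha_s(\mu^2)/[1 + \alpha_s(\mu^2)\beta_0\ln(Q^2/\mu^2)]$ (the form implied by (15.52) and (15.53)), takes $\beta_0 = (33 - 2N_f)/(12\pi)$, and uses an assumed reference value $m_Z = 91.1876$ GeV together with the world average quoted later in (15.62). The function and variable names are hypothetical.

```python
import math

def beta0(n_f):
    """One-loop coefficient beta_0 = (33 - 2 N_f) / (12 pi)."""
    return (33.0 - 2.0 * n_f) / (12.0 * math.pi)

def alpha_s_running(q2, alpha_ref, mu2, n_f):
    """One-loop running from the reference scale mu^2 to Q^2 (assumed form of (15.50))."""
    return alpha_ref / (1.0 + alpha_ref * beta0(n_f) * math.log(q2 / mu2))

def lambda_squared(alpha_ref, mu2, n_f):
    """Lambda^2 obtained by exponentiating eq. (15.52)."""
    return mu2 * math.exp(-1.0 / (beta0(n_f) * alpha_ref))

def alpha_s_from_lambda(q2, lam2, n_f):
    """One-parameter form, eq. (15.53)."""
    return 1.0 / (beta0(n_f) * math.log(q2 / lam2))

M_Z = 91.1876        # GeV, assumed reference scale
ALPHA_REF = 0.1184   # alpha_s at the Z mass, eq. (15.62)
N_F = 5              # fixed here; no flavour thresholds are crossed above ~5 GeV

LAM2 = lambda_squared(ALPHA_REF, M_Z**2, N_F)
print(f"one-loop Lambda (N_f=5): {1000.0 * math.sqrt(LAM2):.0f} MeV")
for q in (10.0, 35.0, 200.0):  # GeV
    a = alpha_s_running(q * q, ALPHA_REF, M_Z**2, N_F)
    b = alpha_s_from_lambda(q * q, LAM2, N_F)
    print(f"Q = {q:6.1f} GeV   (15.50): {a:.5f}   (15.53): {b:.5f}")
```

The two columns agree to rounding error, which is the content of the statement that (15.50) involves only one independent parameter. The one-loop value of $\Lambda$ obtained this way is noticeably smaller than the higher-order values quoted in section 15.3.2.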


15.3.2 Higher-order calculations, and experimental comparison

So far we have discussed only the ‘one-loop’ calculation of $\beta(\alpha_s)$. The general perturbative expansion for $\beta_s$ can be written as

$$\beta_s(\alpha_s) = -\beta_0\alpha_s^2 - \beta_1\alpha_s^3 - \beta_2\alpha_s^4 + \ldots \tag{15.55}$$

where $\beta_0$ is the one-loop coefficient given in (15.49), $\beta_1$ is the two-loop coefficient, and so on. $\beta_1$ was calculated by Caswell (1974) and Jones (1974), and has the value

$$\beta_1 = \frac{153 - 19N_f}{24\pi^2}. \tag{15.56}$$

The three-loop coefficient $\beta_2$, obtained by Tarasov et al. (1980) and by Larin and Vermaseren (1993), is

$$\beta_2 = \frac{77139 - 15099N_f + 325N_f^2}{3456\pi^3}. \tag{15.57}$$

The four-loop coefficient $\beta_3$ was calculated by van Ritbergen et al. (1997) and by Czakon (2005); we shall not give it here. A technical point to note is that while $\beta_0$ and $\beta_1$ are independent of the scheme adopted for renormalization (see appendix O), the higher-order coefficients do depend on it; the value (15.57) is in the widely used $\overline{\rm MS}$ scheme. Likewise, $\Lambda_{\rm QCD}$ will be scheme-dependent (see appendix O), and the value $\Lambda_{\overline{\rm MS}}$ will be used here (the ‘QCD’ now being understood).

Only in the one-loop approximation for $\beta_s$ can an analytic solution of (15.47) be obtained. However, a useful approximate solution can be found iteratively, as follows. Consider the two-loop version of (15.54), namely

$$\ln(Q^2/\Lambda^2_{\overline{\rm MS}}) = -\int\frac{d\alpha_s}{\beta_0\alpha_s^2 + \beta_1\alpha_s^3}. \tag{15.58}$$

Expanding the denominator and integrating gives

$$\ln(Q^2/\Lambda^2_{\overline{\rm MS}}) = \frac{1}{\beta_0\alpha_s} + \frac{b_1}{\beta_0}\ln\alpha_s + C, \tag{15.59}$$

where $b_1 = \beta_1/\beta_0$ and $C$ is a constant. In the $\overline{\rm MS}$ scheme, $C$ is given by $C = (\beta_1/\beta_0^2)\ln\beta_0$. Then the equation for $\alpha_s$ is

$$\frac{1}{\beta_0\alpha_s} + \frac{b_1}{\beta_0}\ln(\beta_0\alpha_s) = L, \tag{15.60}$$

where we have defined $L = \ln(Q^2/\Lambda^2_{\overline{\rm MS}})$. In first approximation, one sets $b_1$ to zero and finds $\alpha_s = 1/(\beta_0L)$ as before. To obtain the next approximation, we set $\alpha_s = 1/(\beta_0L)$ in the $b_1$ term of (15.60), and solve for $\alpha_s$ to first order in $b_1$. This gives (problem 15.4 (a))

$$\alpha_s = \frac{1}{\beta_0L}\left[1 - \frac{b_1}{\beta_0}\frac{\ln L}{L}\right]. \tag{15.61}$$

Problem 15.4 (b) carries the calculation to the three-loop stage.
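The iterative two-loop formula is easy to invert numerically for $\Lambda_{\overline{\rm MS}}$. The following is a minimal sketch, assuming (15.61) in the form $\alpha_s = (\beta_0L)^{-1}[1 - (\beta_1/\beta_0^2)\ln L/L]$, the coefficients $\beta_0$ and $\beta_1$ of (15.49) and (15.56), and an assumed $m_Z = 91.1876$ GeV. The bisection helper and its bracket are illustrative choices, not part of the text.

```python
import math

def beta0(n_f):
    return (33.0 - 2.0 * n_f) / (12.0 * math.pi)

def beta1(n_f):
    return (153.0 - 19.0 * n_f) / (24.0 * math.pi**2)

def alpha_s_two_loop(q, lam, n_f):
    """Iterative two-loop solution (15.61); q and lam in GeV."""
    b0, b1 = beta0(n_f), beta1(n_f)
    big_l = math.log(q * q / (lam * lam))
    return (1.0 / (b0 * big_l)) * (1.0 - (b1 / b0**2) * math.log(big_l) / big_l)

def solve_lambda(alpha_target, q, n_f, lo=0.05, hi=1.0, tol=1e-10):
    """Bisection on Lambda: at fixed Q, alpha_s grows as Lambda grows."""
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if alpha_s_two_loop(q, mid, n_f) < alpha_target:
            lo = mid
        else:
            hi = mid
        if hi - lo < tol:
            break
    return 0.5 * (lo + hi)

M_Z = 91.1876  # GeV, assumed value
LAM5 = solve_lambda(0.1184, M_Z, 5)
print(f"Lambda_MSbar (N_f = 5, two-loop iterative): {1000.0 * LAM5:.0f} MeV")
```

With the input of (15.62) the result should land close to the two-loop value of about 231 MeV quoted in the text that follows.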
The current world average value of $\alpha_s(m_Z^2)$ is (Bethke 2009)

$$\alpha_s(m_Z^2) = 0.1184 \pm 0.0007. \tag{15.62}$$

The remarkable precision of this number represents extraordinary consistency among the many methods used to determine it³, which include deep inelastic scattering, electroweak fits, $e^+e^- \to$ jets, and lattice calculations (see chapter 16). If (15.62) is used to determine $\Lambda_{\overline{\rm MS}}$ from (15.61), one finds $\Lambda_{\overline{\rm MS}} = 231$ MeV; using the 3-loop formula of problem 15.4 (b) gives $\Lambda_{\overline{\rm MS}} = 213$ MeV (Bethke 2009).

These values of $\Lambda_{\overline{\rm MS}}$ are for $N_f = 5$, appropriate for the $Z^0$ mass region, well above the b threshold. As $Q^2$ runs to smaller values, and a quark mass threshold is crossed, $N_f$ changes by one unit, and so correspondingly do the coefficients $\beta_0, \beta_1, \ldots$. Physical quantities must however be continuous across a quark threshold. This requires that the values of $\alpha_s$ above and below that threshold satisfy certain matching conditions (Rodrigo and Santamaria 1993, Bernreuther and Wetzel 1982, Chetyrkin et al. 1997). These are satisfied by allowing $\Lambda_{\overline{\rm MS}}$ to depend on $N_f$. At one and two loop order, the matching condition is simply $\alpha_s^{(N_f-1)} = \alpha_s^{(N_f)}$, which can be straightforwardly implemented in terms of $\Lambda^{(N_f-1)}_{\overline{\rm MS}}$ and $\Lambda^{(N_f)}_{\overline{\rm MS}}$. In higher orders the matching conditions contain additional terms, which are required at $(n-1)$-loop order for an $n$-loop calculation of $\alpha_s$.

Figure 15.5 shows a summary (Bethke 2009) of measurements of $\alpha_s$ as a function of the energy scale $Q$, compared with the QCD prediction. The latter is evaluated in 4-loop approximation, using 3-loop threshold matching conditions at the masses $m_c = 1.5$ GeV and $m_b = 4.7$ GeV. The agreement is perfect, a triumph for both experiment and theory.
FIGURE 15.5
Comparison between measurements of $\alpha_s$ and the theoretical prediction, as a function of the energy scale $Q$ (Bethke 2009). (See color plate I.)

³ With the exception of a long-standing systematic difference: results from structure functions prefer a smaller value of $\alpha_s(m_Z^2)$ than most of the others.


15.4 $\sigma(e^+e^- \to {\rm hadrons})$ revisited


We may now return to the physical process which originally motivated this extensive detour. The perturbative corrections to $\sigma_{\rm pt}(Q^2)$ are expressed as a power series in $\alpha_s$,

$$\sigma(e^+e^- \to {\rm hadrons}) = \sigma_{\rm pt}(Q^2)\left[1 + \sum_{n=1}^{\infty}c_n(Q^2/\mu^2)\left(\frac{\alpha_s(\mu^2)}{\pi}\right)^n\right], \tag{15.63}$$


where $\mu$ is the renormalization scale. (A similar expansion can be written for many other physical quantities too.) The coefficients from $c_2$ onwards depend on the renormalization scheme (see appendix O), and are usually quoted in the $\overline{\rm MS}$ scheme. $c_1$ is the leading order (LO) coefficient, and we already know that $c_1 = 1$ from (15.1). $c_2$ is the next-to-leading order (NLO) coefficient; $c_2(1)$ was calculated by Dine and Sapirstein (1979), Chetyrkin et al. (1979) and by Celmaster and Gonsalves (1980), and has the value $1.9857 - 0.1152N_f$. The next-to-next-to-leading order (NNLO) coefficient $c_3(1)$ was calculated by Samuel and Surguladze (1991) and by Gorishnii et al. (1991), and is equal to $-12.8$ for five flavours. The N³LO coefficient $c_4(1)$ (which requires the evaluation of some twenty thousand diagrams) may be found in Baikov et al. (2008) and Baikov et al. (2009).

The physical cross section $\sigma(e^+e^- \to {\rm hadrons})$ must be independent of the renormalization scale $\mu$, and this would also be true of the series in (15.63) if an infinite number of terms were kept: the $\mu^2$-dependence of the coefficients $c_n(Q^2/\mu^2)$ would cancel that of $\alpha_s(\mu^2)$. This requirement can be imposed order by order in $\alpha_s$ to fix the $\mu^2$-dependence of the coefficients, and is a direct way of applying the RGE idea. Consider, for example, truncating the series at the $n = 2$ stage:

$$\sigma(e^+e^- \to {\rm hadrons}) \approx \sigma_{\rm pt}(Q^2)\left[1 + \alpha_s(\mu^2)/\pi + c_2(Q^2/\mu^2)(\alpha_s(\mu^2)/\pi)^2\right]. \tag{15.64}$$

Differentiating with respect to $\mu^2$ and setting the result to zero we obtain

$$\mu^2\frac{dc_2}{d\mu^2} = -\frac{\pi\beta(\alpha_s(\mu^2))}{(\alpha_s(\mu^2))^2} \tag{15.65}$$

where an $O(\alpha_s^3)$ term has been dropped. Substituting the one-loop result (15.48), as is consistent to this order, we find

$$c_2(Q^2/\mu^2) = c_2(1) - \pi\beta_0\ln(Q^2/\mu^2). \tag{15.66}$$

The second term on the right-hand side of (15.66) gives the contribution identified in (15.2).
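The effect of (15.66) on the residual scale dependence can be illustrated numerically. The sketch below is not from the text: it assumes one-loop running of $\alpha_s$, five flavours, $Q = m_Z$ with the assumed value $m_Z = 91.1876$ GeV, and the conventional (but arbitrary) variation of $\mu^2$ between $Q^2/4$ and $4Q^2$. All names are hypothetical.

```python
import math

N_F = 5
BETA0 = (33.0 - 2.0 * N_F) / (12.0 * math.pi)
C2_AT_1 = 1.9857 - 0.1152 * N_F   # c_2(1) as quoted in the text

def alpha_s(mu2, alpha_q, q2):
    """One-loop running from Q^2 to mu^2."""
    return alpha_q / (1.0 + alpha_q * BETA0 * math.log(mu2 / q2))

def c2(q2_over_mu2):
    """Scale-dependent coefficient, eq. (15.66)."""
    return C2_AT_1 - math.pi * BETA0 * math.log(q2_over_mu2)

def ratio(order, mu2, q2, alpha_q):
    """sigma / sigma_pt from (15.63), truncated at n = order (1 or 2)."""
    a = alpha_s(mu2, alpha_q, q2) / math.pi
    r = 1.0 + a
    if order >= 2:
        r += c2(q2 / mu2) * a * a
    return r

def spread(order, q2, alpha_q, factors=(0.25, 1.0, 4.0)):
    """Max minus min of the truncated prediction as mu^2 = factor * Q^2 is varied."""
    values = [ratio(order, f * q2, q2, alpha_q) for f in factors]
    return max(values) - min(values)

Q2 = 91.1876**2      # GeV^2, assumed
ALPHA_Q = 0.1184     # alpha_s(Q^2) at this scale
SPREAD_N1 = spread(1, Q2, ALPHA_Q)
SPREAD_N2 = spread(2, Q2, ALPHA_Q)
print(f"scale spread, series truncated at n=1: {SPREAD_N1:.5f}")
print(f"scale spread, series truncated at n=2: {SPREAD_N2:.5f}")
```

Including the $n = 2$ term with the logarithm of (15.66) reduces the spread substantially, as expected if the leftover $\mu^2$-dependence is of the order of the first neglected term.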

In practice only a finite number of terms $n = N$ will be available, and a $\mu^2$-dependence will remain, which implies an uncertainty in the prediction of the cross section (and similar physical observables), due to the arbitrariness of the scale choice. This uncertainty will be of the same order as the neglected terms, i.e. of order $\alpha_s^{N+1}$. Thus the scale dependence of a QCD prediction gives a measure of the uncertainties due to neglected terms. For $\sigma(e^+e^- \to {\rm hadrons})$ the choice of scale $\mu^2 = Q^2$ is usually made, so as to avoid large logarithms in relations such as (15.66).

Before proceeding to our second main application of the RGE, scaling violations in deep inelastic scattering, it is necessary to take another detour, to enlarge our understanding of the scope of the RGE.

15.5 A more general form of the RGE: anomalous dimensions and running masses


The reader may have wondered why, for QCD, all the graphs of figure 15.4 are needed, whereas for QED we got away with only figure 11.3. The reason for the simplification in QED was the equality between the renormalization constants $Z_1$ and $Z_2$, which therefore cancelled out in the relation between the renormalized and bare charges $e$ and $e_0$, as briefly stated before equation (15.8) (this equality was discussed in section 11.6). We recall that $Z_2$ is the field strength renormalization factor for the charged fermion in QED, and $Z_1$ is the vertex part renormalization constant; their relation to the counter terms was given in equation (11.7). For QCD, although gauge invariance does imply generalizations of the Ward identity used to prove $Z_1 = Z_2$ (Taylor 1971, Slavnov 1972), the consequence is no longer the simple relation ‘$Z_1 = Z_2$’ in this case, due essentially to the ghost contributions. In order to see what change $Z_1 \neq Z_2$ would make, let us return to the one-loop calculation of $\beta$ for QED, pretending that $Z_1 \neq Z_2$. We have

$$e_0 = \frac{Z_1}{Z_2}Z_3^{-1/2}e_\mu \tag{15.67}$$

where, because we are renormalizing at scale $\mu$, all the $Z_i$'s depend on $\mu$ (as in (15.15)), but we shall now not indicate this explicitly. Taking logs and differentiating with respect to $\mu$ at constant $e_0$, we obtain

$$\mu\frac{d}{d\mu}\bigg|_{e_0}\ln Z_1 - \mu\frac{d}{d\mu}\bigg|_{e_0}\ln Z_2 - \frac12\mu\frac{d}{d\mu}\bigg|_{e_0}\ln Z_3 + \frac{\mu}{e_\mu}\frac{de_\mu}{d\mu}\bigg|_{e_0} = 0. \tag{15.68}$$

Hence

$$\beta(e_\mu) \equiv \mu\frac{de_\mu}{d\mu}\bigg|_{e_0} = e_\mu\gamma_3 + 2e_\mu\gamma_2 - e_\mu\,\mu\frac{d}{d\mu}\bigg|_{e_0}\ln Z_1, \tag{15.69}$$

where

$$\gamma_2 \equiv \frac12\mu\frac{d}{d\mu}\bigg|_{e_0}\ln Z_2, \qquad \gamma_3 \equiv \frac12\mu\frac{d}{d\mu}\bigg|_{e_0}\ln Z_3. \tag{15.70}$$

To leading order in $e_\mu$, the $\gamma_3$ term in (15.70) reproduces (15.26) when (15.15) is used for $Z_3$, the other two terms in (15.68) cancelling via $Z_1 = Z_2$. So if, as in the case of QCD, $Z_1$ is not equal to $Z_2$, we need to introduce the contributions from loops determining the fermion field strength renormalization factor, as well as those related to the vertex parts (together with appropriate ghost loops), in addition to the vacuum polarization loop associated with $Z_3$.

Quantities such as $\gamma_2$ and $\gamma_3$ have an interesting and important significance, which we shall illustrate in the case of $\gamma_2$ for QED. $Z_2$ enters into the relation between the propagator of the bare fermion $\langle\Omega|T(\hat\psi_0(x)\bar{\hat\psi}_0(0))|\Omega\rangle$ and the renormalized one, via (cf (11.2))

$$\langle\Omega|T(\hat\psi(x)\bar{\hat\psi}(0))|\Omega\rangle = \frac{1}{Z_2}\langle\Omega|T(\hat\psi_0(x)\bar{\hat\psi}_0(0))|\Omega\rangle, \tag{15.71}$$

where (cf section 10.1.3) $|\Omega\rangle$ is the vacuum of the interacting theory. The Fourier transform of (15.71) is, of course, the Feynman propagator:

$$\tilde S'_F(q^2) = \int d^4x\, e^{iq\cdot x}\langle\Omega|T(\hat\psi(x)\bar{\hat\psi}(0))|\Omega\rangle. \tag{15.72}$$

Suppose we now ask: what is the large $-q^2$ behaviour of (15.72) for space-like $q^2$, with $-q^2 \gg m^2$ where $m$ is the fermion mass? This sounds very similar to the question answered in 15.2.3 for the quantity $S(|q^2|/\mu^2, e_\mu)$. However, the latter was dimensionless whereas (recalling that $\hat\psi$ has mass dimension $\frac32$) $\tilde S'_F(q^2)$ has dimension $M^{-1}$. This dimensionality is, of course, just what a propagator of the free-field form $i/(\not{q} - m)$ would provide.

Accordingly, we extract this $(\not{q})^{-1}$ factor (compare $\sigma/\sigma_{\rm pt}$) and consider the dimensionless ratio $\tilde R'_F(|q^2|/\mu^2, \alpha_\mu) = \not{q}\,\tilde S'_F(q^2)$. We might guess that, just as for $S(|q^2|/\mu^2, \alpha_\mu)$, to get the leading large $|q^2|$ behaviour we will need to calculate $\tilde R'_F$ to some order in $\alpha_\mu$, and then replace $\alpha_\mu$ by $\alpha(|q^2|/\mu^2)$. But this is not quite all. The factor $Z_2$ in (15.71) will, as noted above, depend on the renormalization scale $\mu$, just as $Z_3$ of (15.15) did. Thus when we change $\mu$, the normalization of the $\hat\psi$'s will change via the $Z_2^{1/2}$ factors (of course by a finite amount here), and we must include this change when writing down the analogue of (15.33) for this case (i.e. the condition that the ‘total change, on changing $\mu$, is zero’). The required equation is

$$\left[\mu^2\frac{\partial}{\partial\mu^2}\bigg|_{\alpha_\mu} + \beta(\alpha_\mu)\frac{\partial}{\partial\alpha_\mu} + \gamma_2(\alpha_\mu)\right]\tilde R'_F(|q^2|/\mu^2, \alpha_\mu) = 0. \tag{15.73}$$

The solution of (15.73) is somewhat more complicated than that of (15.33). We can gain insight into the essential difference caused by the presence of $\gamma_2$ by considering the special case $\beta(\alpha_\mu) = 0$. In this case, we easily find

$$\tilde R'_F(|q^2|/\mu^2, \alpha_\mu) \propto (\mu^2)^{-\gamma_2(\alpha_\mu)}. \tag{15.74}$$

But since $\tilde R'_F$ can only depend on $\mu$ via $|q^2|/\mu^2$, we learn that if $\beta = 0$ then the large $|q^2|$ behaviour of $\tilde R'_F$ is given by $(|q^2|/\mu^2)^{\gamma_2}$ or, in other words, that at large $|q^2|$

$$\tilde S'_F(q^2) \propto \frac{1}{\not{q}}\left(\frac{|q^2|}{\mu^2}\right)^{\gamma_2}. \tag{15.75}$$

Thus, at a zero of the $\beta$-function, $\tilde S'_F$ has an ‘anomalous’ power law dependence on $|q^2|$ (i.e. in addition to the obvious $\not{q}^{-1}$ factor), which is controlled by the parameter $\gamma_2$. The latter is called the ‘anomalous dimension’ of the fermion field, since its presence effectively means that the $|q^2|$ behaviour of $\tilde S'_F$ is not determined by its ‘normal’ dimensionality $M^{-1}$. The behaviour (15.75) is often referred to as ‘scaling with anomalous dimension’, meaning that if we multiply $|q^2|$ by a scale factor $\lambda$, then $\tilde S'_F$ is multiplied by $\lambda^{\gamma_2 - 1/2}$ rather than just $\lambda^{-1/2}$ (the $-1/2$ arising because $\not{q}^{-1}$ scales as $|q^2|^{-1/2}$). Anomalous dimensions turn out to play a vital role in the theory of critical phenomena - they are, in fact, closely related to ‘critical exponents’ (see section 16.4.3, and Peskin and Schroeder 1995, chapter 13). Scaling with anomalous dimensions is also exactly what occurs in deep inelastic scattering of leptons from nucleons, as we shall see in section 15.6.

FIGURE 15.6
Possible behaviour of $\beta$ functions. (a) The slope is positive near the origin (as in QED), and negative near $\alpha = \alpha^*$. (b) The slope is negative at the origin (as in QCD), and positive near $\alpha_s = \alpha_s^*$.

The full solution of (15.73) for $\beta \neq 0$ is elegantly discussed in Coleman (1985), chapter 3; see also Peskin and Schroeder (1995) section 12.3. We quote it here:

$$\tilde R'_F(|q^2|/\mu^2, \alpha_\mu) = \tilde R'_F(1, \alpha(|q^2|/\mu^2))\exp\left\{\int_0^t dt'\,\gamma_2(\alpha(t'))\right\}. \tag{15.76}$$

The first factor is the expected one from section 15.2.3; the second results from the addition of the $\gamma_2$ term in (15.73). Suppose now that $\beta(\alpha)$ has a zero at some point $\alpha^*$, in the vicinity of which $\beta(\alpha) \approx -B(\alpha - \alpha^*)$ with $B > 0$. Then, near this point the evolution of $\alpha$ is given by (cf (15.39))

$$\ln(|q^2|/\mu^2) = \int_{\alpha_\mu}^{\alpha(|q^2|)}\frac{d\alpha}{-B(\alpha - \alpha^*)}, \tag{15.77}$$

which implies

$$\alpha(|q^2|) = \alpha^* + {\rm constant}\times(\mu^2/|q^2|)^B. \tag{15.78}$$

Thus asymptotically for large $|q^2|$, the coupling will evolve to the ‘fixed point’ $\alpha^*$. In this case, at sufficiently large $-q^2$, the integral in (15.76) can be evaluated by setting $\alpha(t') = \alpha^*$, and $\tilde R'_F$ will scale with an anomalous dimension $\gamma_2(\alpha^*)$ determined by the fixed point value of $\alpha$. The behaviour of such an $\alpha$ is shown in figure 15.6(a). We emphasize that there is no reason to believe that the QED $\beta$ function actually does behave like this.
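
The step from (15.77) to (15.78) can be made explicit. The following short working assumes only the linearized form $\beta(\alpha) \approx -B(\alpha - \alpha^*)$ near the zero; it identifies the ‘constant’ of (15.78) as $\alpha_\mu - \alpha^*$, where $\alpha_\mu$ denotes the coupling at the scale $\mu$:

```latex
\begin{align*}
\ln\frac{|q^2|}{\mu^2}
  &= -\frac{1}{B}\int_{\alpha_\mu}^{\alpha(|q^2|)}\frac{d\alpha}{\alpha-\alpha^*}
   = -\frac{1}{B}\,\ln\frac{\alpha(|q^2|)-\alpha^*}{\alpha_\mu-\alpha^*},\\[4pt]
\alpha(|q^2|)-\alpha^*
  &= (\alpha_\mu-\alpha^*)\left(\frac{\mu^2}{|q^2|}\right)^{B}
  \;\longrightarrow\; 0 \quad \text{as } |q^2|\to\infty \quad (B>0).
\end{align*}
```

The sign of $B$ is what decides the direction of the flow: for $B > 0$ the deviation from $\alpha^*$ dies away as a power of $|q^2|$ at large $|q^2|$, whereas reversing the sign of the slope would make the deviation shrink towards small $|q^2|$ instead.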

The point $\alpha^*$ in figure 15.6(a) is called an ultraviolet-stable fixed point: $\alpha$ ‘flows’ towards it at large $|q^2|$. In the case of QCD, the $\beta$ function starts out negative, so that the corresponding behaviour (with a zero at $\alpha_s = \alpha_s^* \neq 0$) would look like that shown in figure 15.6(b). In this case, the reader can check (problem 15.5) that $\alpha_s^*$ is reached in the infrared limit $q^2 \to 0$, and so $\alpha_s^*$ is called an infrared-stable fixed point. Clearly it is the slope of $\beta$ near the fixed point that determines whether it is u-v or i-r stable. This applies equally to a fixed point at the origin, so that QED is i-r stable at $\alpha = 0$ while QCD is u-v stable at $\alpha_s = 0$.

We must now point out to the reader an error in the foregoing analysis, in the case of a gauge theory. The quantity $Z_2$ is not gauge invariant in QED (or QCD), and hence $\gamma_2$ depends on the choice of gauge. This is really no surprise, because the full fermion propagator itself is not gauge invariant (the free-field propagator is gauge invariant, of course). What ultimately matters is that the complete physical amplitude for any process, at a given order of $\alpha$, be gauge invariant. Thus the analysis given above really only applies, in this simple form, to non-gauge theories, such as the ABC model of chapter 6, or to gauge-invariant quantities.

This is an appropriate point at which to consider the treatment of quark masses in the RGE-based approach. Up to now we have simply assumed that the relevant $|q^2|$ is very much greater than all quark masses, the latter therefore being neglected. While this may be adequate for the light quarks u, d, s, it is surely a progressively worse assumption for c, b and t. However, in thinking about how to re-introduce the quark masses into our formalism, we are at once faced with a difficulty: how are they to be defined? For an unconfined particle, such as a lepton, it seems natural to define ‘the’ mass as the position of the pole of the propagator (i.e. the ‘on-shell’ value $p^2 = m^2$), a definition we followed in chapters 10 and 11. Significantly, renormalization is required (in the shape of a mass counter-term) to achieve a pole at the ‘right’ physical mass $m$, in this sense. But this prescription is inherently perturbative, and cannot be used for a confined particle, which never ‘escapes’ beyond the range of the non-perturbative confining forces, and whose propagator can therefore never approach the form $\sim(\not{p} - m)^{-1}$ of a free particle.

Our present perspective on renormalization suggests an obvious way forward. Just as there was, in principle, no necessity to define the QED coupling parameter $e$ via an on-shell prescription, so here a mass parameter in the Lagrangian can be defined in any way we find convenient; all that is necessary is that it should be possible to determine its value from some measurable quantity (for example, quark masses from lattice QCD predictions of hadron masses). Effectively, we are regarding the ‘$m$’ in a term such as $-m\bar{\hat\psi}(x)\hat\psi(x)$ as a ‘coupling constant’ having mass dimension 1 (and, after all, the ABC coupling itself had mass dimension 1). Incidentally, the operator $\bar{\hat\psi}(x)\hat\psi(x)$ is gauge invariant, as is any such local operator. Taking this point of view, it is clear that a renormalization scale will be involved in such a general definition of mass, and we must expect to see our mass parameters ‘evolve’ with this scale, just as the gauge (or other) couplings do. In turn, this will get translated into a $|q^2|$-dependence of the mass parameters, just as for $\alpha(|q^2|)$ and $\alpha_s(|q^2|)$.
The RGE in such a scheme now takes the form

$$\left[\mu^2\frac{\partial}{\partial\mu^2} + \beta(\alpha_s)\frac{\partial}{\partial\alpha_s} + \sum_i\gamma_i(\alpha_s) + \gamma_m(\alpha_s)\,m\frac{\partial}{\partial m}\right]R(|q^2|/\mu^2, \alpha_s, m/|q|) = 0 \tag{15.79}$$

where the partial derivatives are taken at fixed values of the other two variables. Here the $\gamma_i$ are the anomalous dimensions relevant to the quantity $R$, and $\gamma_m$ is an analogous ‘anomalous mass dimension’, arising from finite shifts in the mass parameter when the scale $\mu^2$ is changed. Just as with the solution (15.76) of (15.73), the solution of (15.79) is given in terms of a ‘running mass’ $m(|q^2|)$. Formally, we can think of $\gamma_m$ in (15.79) as analogous to $\beta(\alpha_s)$ and $\ln m$ as analogous to $\alpha_s$. Then equation (15.41) for the running $\alpha_s$,

$$\frac{\partial\alpha_s(|q^2|)}{\partial t} = \beta(\alpha_s(|q^2|)) \tag{15.80}$$

where $t = \ln(|q^2|/\mu^2)$, becomes

$$\frac{\partial(\ln m(|q^2|))}{\partial t} = \gamma_m(\alpha_s(|q^2|)). \tag{15.81}$$

Equation (15.81) has the solution

$$m(|q^2|) = m(\mu^2)\exp\int_{\ln\mu^2}^{\ln|q^2|}d\ln|q'^2|\;\gamma_m(\alpha_s(|q'^2|)). \tag{15.82}$$

To one-loop order in QCD, $\gamma_m(\alpha_s)$ turns out to be $-\frac{1}{\pi}\alpha_s$ (Peskin and Schroeder 1995, section 18.1). Inserting the one-loop solution for $\alpha_s$ in the form (15.53), we find

$$m(|q^2|) = m(\mu^2)\left[\frac{\ln(\mu^2/\Lambda^2)}{\ln(|q^2|/\Lambda^2)}\right]^{1/(\pi\beta_0)}, \tag{15.83}$$

where $(\pi\beta_0)^{-1} = 12/(33 - 2N_f)$. Thus the quark masses decrease logarithmically as $|q^2|$ increases, rather like $\alpha_s(|q^2|)$. It follows that, in general, quark mass effects are suppressed both by explicit $m^2/|q^2|$ factors, and by the logarithmic decrease given by (15.83). Further discussion of the treatment of quark masses is contained in Ellis, Stirling and Webber (1996), section 2.4; see also the review by Manohar and Sachrajda in Nakamura et al. (2010).
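
The one-loop running mass (15.83) is simple to evaluate. The sketch below is illustrative only: the input numbers (a reference mass of 4.2 GeV at $\mu = 4.2$ GeV, a one-loop $\Lambda$ of 0.09 GeV and $N_f = 5$) are assumed values chosen for the example, not results quoted in the text, and the function names are hypothetical. It also checks the equivalent statement that, at this order, $m(|q^2|)/m(\mu^2) = [\alpha_s(|q^2|)/\alpha_s(\mu^2)]^{1/(\pi\beta_0)}$, which follows on inserting (15.53) into (15.83).

```python
import math

def beta0(n_f):
    return (33.0 - 2.0 * n_f) / (12.0 * math.pi)

def alpha_s(q2, lam2, n_f):
    """One-loop coupling, eq. (15.53)."""
    return 1.0 / (beta0(n_f) * math.log(q2 / lam2))

def running_mass(q2, m_ref, mu2, lam2, n_f):
    """One-loop running mass, eq. (15.83)."""
    exponent = 1.0 / (math.pi * beta0(n_f))   # equals 12 / (33 - 2 N_f)
    return m_ref * (math.log(mu2 / lam2) / math.log(q2 / lam2)) ** exponent

# Illustrative inputs (assumptions for this example only).
LAMBDA = 0.09   # GeV
N_F = 5
M_REF = 4.2     # GeV, mass parameter at the reference scale
MU = 4.2        # GeV, reference scale
LAM2 = LAMBDA**2

for q in (4.2, 10.0, 91.1876):  # GeV
    m_q = running_mass(q * q, M_REF, MU**2, LAM2, N_F)
    print(f"|q| = {q:8.4f} GeV   m(|q^2|) = {m_q:.3f} GeV")
```

The mass parameter falls slowly (logarithmically) as the scale increases, in line with the remark following (15.83).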




15.6 QCD corrections to the parton model predictions for deep inelastic scattering: scaling violations


As we saw in section 9.2, the parton model provides a simple intuitive explanation for the experimental observation that the nucleon structure functions in deep inelastic scattering depend, to a good first approximation, only on the dimensionless ratio $x = Q^2/2M\nu$, rather than on $Q^2$ and $\nu$ separately; this behaviour is referred to as ‘scaling’. Here $M$ is the nucleon mass, and $Q^2$ and $\nu$ are defined in (9.7) and (9.8). In this section we shall show how QCD corrections to the simple parton model, calculated using RGE techniques, predict observable violations of scaling in deep inelastic scattering. As we shall see, comparison between the theoretical predictions and experimental measurements provides strong evidence for the correctness of QCD as the theory of nucleonic constituents.


15.6.1 Uncancelled mass singularities at order $\alpha_s$


The free parton model amplitudes we considered in chapter 9 for deep inelastic lepton-nucleon scattering were of the form shown in figure 15.7 (cf figure 9.4). The obvious first QCD corrections will be due to real gluon emission by either the initial or final quark, as shown in figure 15.8, but to these we must add the one-loop virtual gluon processes of figure 15.9 in order (see below) to get rid of infrared divergences similar to those encountered in section 14.4.2, and also the diagram of figure 15.10, corresponding to the presence of gluons in the nucleon. To simplify matters, we shall consider what is called a ‘non-singlet structure function’ $F_2^{\rm NS}$, such as $F_2^{ep} - F_2^{en}$, in which the (flavour) singlet gluon contribution cancels out, leaving only the diagrams of figures 15.8 and 15.9.

We now want to perform, for these diagrams, calculations analogous to those of section 9.2, which enabled us to find the e-N structure functions $\nu W_2$ and $MW_1$ from the simple parton process of figure 15.7. There are two problems here: one is to find the parton level $W$'s corresponding to figure 15.8 (leaving aside figure 15.9 for the moment), cf equations (9.29) and (9.30) in the case of the free parton diagram figure 15.7; the other is to relate these parton $W$'s to observed nucleon $W$'s via an integration over momentum fractions. In section 9.2 we solved the first problem by explicitly calculating the parton level $d^2\sigma^i/dQ^2d\nu$ and picking off the associated $\nu W_2^i$, $W_1^i$. In principle, the same can be done here, starting from the five-fold differential cross section for our $e^- + q \to e^- + q + g$ process. However, a simpler (if somewhat heuristic) way is available. We note from (9.46) that in general $F_1 = MW_1$ is given by the transverse virtual photon cross section

$$W_1 = \sigma_T/(4\pi^2\alpha/K) = \frac12\sum_{\lambda=\pm1}\epsilon^*_\mu(\lambda)\epsilon_\nu(\lambda)W^{\mu\nu} \tag{15.84}$$

where $W^{\mu\nu}$ was defined in (9.3). Further, the Callan-Gross relation is still true (the photon only interacts with the charged partons, which are quarks with spin $\frac12$ and charge $e_i$), and so

$$F_2/x = 2F_1 = 2MW_1 = \sigma_T/(4\pi^2\alpha/2MK). \tag{15.85}$$


These formulae are valid for both parton and proton $W_1$'s and $W^{\mu\nu}$'s, with appropriate changes for parton masses $\hat M$. Hence the parton level $2\hat F_1$ for figure 15.8 is just the transverse photon cross section as calculated from the graphs of figure 15.11, divided by the factor $4\pi^2\alpha/2\hat M\hat K$, where as usual ‘$\hat{\ }$’ denotes kinematic quantities in the corresponding parton process. This cross section, however, is (apart from a colour factor) just the virtual Compton cross section calculated in section 8.6. Also, taking the same (Hand) convention for the individual photon flux factors,

$$2\hat M\hat K = \hat s. \tag{15.86}$$

FIGURE 15.7
Electron-quark scattering via one-photon exchange.

FIGURE 15.8
Electron-quark scattering with single-gluon emission.

FIGURE 15.9
Virtual single-gluon corrections to figure 15.7.

FIGURE 15.10
Electron-gluon scattering with $q\bar q$ production.

FIGURE 15.11
Virtual photon processes entering into figure 15.8.


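Thus for the parton processes of figure 15.11,

$$2\hat F_1 = \hat\sigma_T/(4\pi^2\alpha/2\hat M\hat K) = \frac{\hat s}{4\pi^2\alpha}\int_{-1}^{1}d\cos\theta\;\frac43\,\frac{\pi e_i^2\alpha\alpha_s}{\hat s}\left(-\frac{\hat t}{\hat s} - \frac{\hat s}{\hat t} + \frac{2\hat uQ^2}{\hat s\hat t}\right) \tag{15.87}$$

where, in going from (8.181) to (15.87), we have inserted a colour factor $\frac43$ (problem 14.5 (a)), renamed the variables $\hat t \to \hat u$, $\hat u \to \hat t$ in accordance with figure 15.11, and replaced $\alpha^2$ by $e_i^2\alpha\alpha_s$.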

Before proceeding with (15.87), it is helpful to consider the other part of the calculation, namely the relation between the nucleon $F_1$ and the parton $\hat F_1$. We mimic the discussion of section 9.2, but with one significant difference: the quark ‘taken’ from the proton still has momentum fraction $y$ (momentum $yp$), but now its longitudinal momentum must be degraded in the final state due to the gluon bremsstrahlung process we are calculating. Let us call the quark momentum after gluon emission $zyp$ (figure 15.12). Then, assuming as in section 9.2 that it stays on-shell, we have

$$q^2 + 2zy\,q\cdot p = 0 \tag{15.88}$$

or

$$x = yz, \qquad x = Q^2/2q\cdot p, \qquad q^2 = -Q^2 \tag{15.89}$$

and we can write (cf (9.31))

$$\frac{F_2}{x} = 2F_1 = \sum_i\int_0^1 dy\,f_i(y)\int_0^1 dz\;2\hat F_1^i\,\delta(x - yz) \tag{15.90}$$

where the $f_i(y)$ are the parton distribution functions introduced in section 9.2 (we often call them $q(x)$ or $g(x)$ as the case may be) for parton type $i$, and the sum is over contributing partons. The reader may enjoy checking that (15.90) does reduce to (9.34) for free partons by showing that in that case $2\hat F_1^i = e_i^2\delta(1 - z)$ (see Halzen and Martin 1984, section 10.3, for help), so that $2F_1^{\rm free} = \sum_i e_i^2f_i(x)$.

FIGURE 15.12
The first process of figure 15.11, viewed as a contribution to $e^-$-nucleon scattering.

FIGURE 15.13
Kinematics for the parton process of figure 15.12.

To proceed further with the calculation (i.e. of (15.87) inserted into (15.90)), we need to look at the kinematics of the $\gamma q \to qg$ process, in the CMS. Referring to figure 15.13, we let $k$, $k'$ be the magnitudes of the CMS momenta $\boldsymbol{k}$, $\boldsymbol{k}'$. Then

$$\begin{aligned}
\hat s &= 4k'^2 = (yp + q)^2 = Q^2(1 - z)/z, \qquad z = Q^2/(\hat s + Q^2)\\
\hat t &= (q - p')^2 = -2kk'(1 - \cos\theta) = -Q^2(1 - c)/2z, \qquad c = \cos\theta\\
\hat u &= (q - q')^2 = -2kk'(1 + \cos\theta) = -Q^2(1 + c)/2z.
\end{aligned} \tag{15.91}$$

We now note that in the integral (15.87) for $\hat F_1$, when we integrate over $c = \cos\theta$, we shall obtain an infinite result

$$\int_{-1}^{1}\frac{dc}{1 - c} \tag{15.92}$$

associated with the vanishing of $\hat t$ in the ‘forward’ direction (i.e. when $q$ and $p'$ are parallel). This is a divergence of the ‘collinear’ type, in the terminology of section 14.4.2 or, as there, a ‘mass singularity’, occurring in the zero quark mass limit. If we simply replace the propagator factor $\hat t^{-1} = [(q - p')^2]^{-1}$ by $[(q - p')^2 - m^2]^{-1}$, where $m$ is a quark mass, then (15.92) becomes

$$\int_{-1}^{1}\frac{dc}{(1 + 2m^2z/Q^2) - c} \tag{15.93}$$

which will produce a factor of the form $\ln(Q^2/m^2)$ as $m^2 \to 0$. Thus $m$ regulates the divergence. We have here an uncancelled mass singularity, and it violates scaling. This crucial physical result is present in the lowest-order QCD correction to the parton model, in this case. As we are learning, such logarithmic violations of scaling are a characteristic feature of all QCD corrections to the free (scaling) parton model.
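
The origin of the logarithm can be seen by doing the regulated integral (15.93) explicitly. In the short working below, $\epsilon \equiv 2m^2z/Q^2$ is a shorthand introduced here for convenience (it is not a symbol used in the text):

```latex
\begin{align*}
\int_{-1}^{1}\frac{dc}{(1+\epsilon)-c}
  &= \Big[-\ln\big((1+\epsilon)-c\big)\Big]_{-1}^{1}
   = \ln\frac{2+\epsilon}{\epsilon},
   \qquad \epsilon \equiv \frac{2m^2z}{Q^2},\\[4pt]
  &\xrightarrow{\;m^2\to 0\;}\ \ln\frac{Q^2}{m^2 z}
   \;=\; \ln\frac{Q^2}{m^2} - \ln z .
\end{align*}
```

The mass-singular piece is the $\ln(Q^2/m^2)$ term; the remaining $-\ln z$ is finite as $m^2 \to 0$ and is of the kind that can be grouped with the non-singular terms.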

We may calculate the coefficient of the $\ln Q^2$ term by retaining in (15.87) only the terms proportional to $\hat t^{-1}$:

$$2\hat F_1 \approx e_i^2\int_{-1}^{1}\frac{dc}{1 - c}\;\frac{\alpha_s(\mu^2)}{2\pi}\,\frac43\,\frac{1 + z^2}{1 - z} \tag{15.94}$$

and so, for just one quark species, this QCD correction contributes (from (15.90)) a term

$$e_i^2\,\frac{\alpha_s(\mu^2)}{2\pi}\int_x^1\frac{dy}{y}\,q(y)\left\{P_{qq}(x/y)\ln(Q^2/m^2) + C(x/y)\right\} \tag{15.95}$$

to $2F_1$, where

$$P_{qq}(z) = \frac43\left(\frac{1 + z^2}{1 - z}\right) \tag{15.96}$$

and $C(x/y)$ has no mass singularity.

Our result so far is therefore that the ‘free’ quark distribution function $q(x)$, which depended only on the scaling variable $x$, becomes modified to

$$q(x) + \frac{\alpha_s(\mu^2)}{2\pi}\int_x^1\frac{dy}{y}\,q(y)\left\{P_{qq}(x/y)\ln(Q^2/m^2) + C(x/y)\right\} \tag{15.97}$$

$$= q(x) + \frac{\alpha_s(\mu^2)}{2\pi}\int_0^1 dy\int_0^1 dz\;q(y)\,\delta(x - yz)\left\{P_{qq}(z)\ln(Q^2/m^2) + C(z)\right\} \tag{15.98}$$

due to lowest-order gluon radiation. Clearly, this corrected distribution function violates scaling because of the $\ln Q^2$ term. But the result as it stands cannot represent a well-controlled approximation, since it contains divergences as $z \to 1$ and as $m^2 \to 0$.

We postpone discussion of the mass divergence until the next section. The divergence as $z \to 1$ is a standard infrared divergence (the quark momentum $yzp$ after gluon emission becomes equal to the quark momentum $yp$ before emission), and we expect that it can be cured by including the virtual gluon diagrams of figure 15.9, as indicated at the start of the section (and as was done analogously in the case of $e^+e^-$ annihilation). This has been verified explicitly by Kim and Schilcher (1978) and by Altarelli et al. (1978 a, b; 1979). Alternatively, we follow the procedure of Altarelli and Parisi (1977). First we regulate the divergence as $z \to 1$ by defining a regulated function $1/(1 - z)_+$ such that

$$\int_0^1 dz\,\frac{f(z)}{(1 - z)_+} = \int_0^1 dz\,\frac{f(z) - f(1)}{1 - z} = \int_0^1 dz\,\ln(1 - z)\frac{df(z)}{dz} \tag{15.99}$$

where $f(z)$ is any test function sufficiently regular at the end points. Now the gluon loops which will cancel the i-r divergence only contribute at $z \to 1$, in leading log approximation. Thus the i-r finite version of $P_{qq}$ has the form

$$P_{qq}(z) = \frac43\,\frac{1 + z^2}{(1 - z)_+} + A\delta(1 - z). \tag{15.100}$$

The coefficient $A$ is determined by the physical requirement that the net number of quarks (i.e. the number of quarks minus the number of antiquarks) does not vary with $Q^2$. From (15.98) this implies

$$\int_0^1 P_{qq}(z)\,dz = 0. \tag{15.101}$$

Inserting (15.100) into (15.101), and using (15.99), we find (problem 15.6)

$$A = 2, \tag{15.102}$$

so that

$$P_{qq}(z) = \frac43\,\frac{1 + z^2}{(1 - z)_+} + 2\delta(1 - z). \tag{15.103}$$

The function $P_{qq}$ is called a ‘splitting function’, and it has an important physical interpretation. The quantity $\alpha_s(\mu^2)/(2\pi)\,P_{qq}(z)$ is, for $z < 1$, the probability that, to first order in $\alpha_s$, a quark having radiated a gluon is left with a fraction $z$ of its original momentum. Similar functions arise in QED in connection with what is called the ‘equivalent photon approximation’ (Weizsäcker 1934, Williams 1934, Chen and Zerwas 1975). The application of these techniques to QCD corrections to the free parton model is due to Altarelli and Parisi (1977), who thereby opened the way to this simpler and more physical way of understanding scaling violations, which had previously been discussed mainly within the rather technical operator product formalism (Wilson 1969).
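
The value $A = 2$ can be checked numerically from the definition (15.99) of the plus prescription. The sketch below is illustrative: it implements the subtracted integral with a plain midpoint rule (a choice made here for simplicity), and the helper names are hypothetical.

```python
def plus_integral(f, n=100_000):
    """Midpoint-rule estimate of int_0^1 f(z) / (1 - z)_+ dz,
    using the subtracted form of eq. (15.99): int_0^1 (f(z) - f(1)) / (1 - z) dz."""
    h = 1.0 / n
    f_at_one = f(1.0)
    total = 0.0
    for i in range(n):
        z = (i + 0.5) * h          # midpoints never reach z = 1
        total += (f(z) - f_at_one) / (1.0 - z)
    return total * h

def regular_part_integral():
    """Integral over 0 < z < 1 of the first term of (15.100)."""
    return (4.0 / 3.0) * plus_integral(lambda z: 1.0 + z * z)

# Condition (15.101): regular part + A = 0, since the delta function integrates to A.
A = -regular_part_integral()
print(f"A = {A:.6f}")
```

For the test function $f(z) = 1 + z^2$ the subtracted integrand reduces to $-(1 + z)$, so the regular part integrates to $\frac43 \times (-\frac32) = -2$, in agreement with (15.102).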

We must now find some way of making sense, physically, of the uncancelled mass divergence in (15.97).


15.6.2 Factorization, and the order $\alpha_s$ DGLAP equation


The key is to realize that when two partons are in the collinear configuration their relative momentum is very small, and hence the interaction between them is very strong, beyond the reach of a perturbative calculation. This suggests that we should absorb such uncalculable effects into a modified distribution function $q(x, \mu_F^2)$ given by

$$q(x, \mu_F^2) = q(x) + \frac{\alpha_s(\mu^2)}{2\pi}\int_x^1\frac{dy}{y}\,q(y)\left\{P_{qq}(x/y)\ln(\mu_F^2/m^2) + C(x/y)\right\} \tag{15.104}$$

which we have to take from experiment. Note that we have also absorbed the non-singular term $C(x/y)$ into $q(x, \mu_F^2)$. In terms of this quantity, then, we have

$$F_2(x, Q^2) = e_i^2\,x\,q(x, Q^2) \tag{15.105}$$

$$= e_i^2\,x\int_x^1\frac{dy}{y}\,q(y, \mu_F^2)\left\{\delta(1 - x/y) + \frac{\alpha_s(\mu^2)}{2\pi}P_{qq}(x/y)\ln(Q^2/\mu_F^2)\right\} \tag{15.106}$$

to this order in $\alpha_s$, and for one quark type.

This procedure is, of course, very reminiscent of ultraviolet renormalization,
in which u-v divergences are controlled by similarly importing some
quantities from experiment. In this example, we have essentially made use of
the simple fact that

$$\ln(Q^2/m^2) = \ln(Q^2/\mu_F^2) + \ln(\mu_F^2/m^2). \qquad (15.107)$$


The arbitrary scale $\mu_F$ is analogous to the renormalization scale $\mu$ (which we have
retained in $\alpha_s(\mu^2)$), and is here referred to as a ‘factorization scale’. It is
the scale entering into the separation in (15.107), between one (uncalculable)
factor which depends on the i-r parameter $m$ but not on $Q^2$, and the other
(calculable) factor which depends on $Q^2$. The scale $\mu_F$ can be thought of
as one which separates the perturbative short-distance physics from the
non-perturbative long-distance physics. Thus partons emitted at small transverse
momenta $< \mu_F$ (i.e. approximately collinear processes) should be considered
as part of the hadron structure, and are absorbed into $q(x, \mu_F^2)$. Partons emitted
at large transverse momenta contribute to the short-distance (calculable)
part of the cross section. Just as for the renormalization scale, the more terms
that can be included in the perturbative contributions to the mass-singular
terms, the weaker the dependence on $\mu_F$ will be. We have demonstrated the
possibility of factorization only to $O(\alpha_s)$, but proofs to all orders in perturbation
theory exist; reviews are provided by Collins and Soper (1987, 1988).

Returning now to (15.106), the reader can guess what is coming next:
we shall impose the condition that the physical quantity $F_2(x, Q^2)$ must be
independent of the choice of factorization scale $\mu_F^2$. Differentiating (15.106)
partially with respect to $\mu_F^2$, and setting the result to zero, we obtain (to order
$\alpha_s$ on the right-hand side)

$$\mu_F^2\, \frac{\partial q(x, \mu_F^2)}{\partial \mu_F^2} = \frac{\alpha_s(\mu^2)}{2\pi} \int_x^1 \frac{dy}{y}\, P_{qq}(x/y)\, q(y, \mu_F^2). \qquad (15.108)$$


This equation is the analogue of equation (15.35) describing the running of
the coupling $\alpha_s$ with $\mu^2$, and is a fundamental equation in the theory of
perturbative applications of QCD. It is called the DGLAP equation, after 
Dokshitzer (1977), Gribov and Lipatov (1972), and Altarelli and Parisi (1977). 
The above derivation is not rigorous: a more sophisticated treatment (Georgi 
and Politzer 1974, Gross and Wilczek 1974) confirms the result and extends 
it to higher orders. 

Equation (15.108) shows that, although perturbation theory cannot be
used to calculate the distribution function $q(x, \mu_F^2)$ at any particular value
$\mu_F^2 = \mu_0^2$, it can be used to predict how the distribution changes (or ‘evolves’) as
$\mu_F^2$ varies. (We recall from (15.105) that $q(x, \mu_0^2)$ can be found experimentally
via $x\, q(x, \mu_0^2) = F_2(x, Q^2 = \mu_0^2)/e_q^2$.) As in the case of $\sigma(e^+ e^- \to \text{hadrons})$
and the scale $\mu^2$, the choice of factorization scale is arbitrary, and would
cancel from physical quantities if all powers in the perturbation series were
included. Truncating at $N$ terms results in an ambiguity of order $\alpha_s^{N+1}$. In
deep inelastic predictions, the standard choice for scales is $\mu^2 = \mu_F^2 = Q^2$.

The way the non-singlet distribution changes can be understood qualitatively
as follows. The change in the distribution for a quark with momentum
fraction $x$, which absorbs the virtual photon, is given by the integral over $y$ of
the corresponding distribution for a quark with momentum fraction $y$, which
radiated a gluon and was left with a fraction $x/y$ of its momentum, with
probability $(\alpha_s/2\pi) P_{qq}(x/y)$. This probability is high for large momentum fractions:
high-momentum quarks lose momentum by radiating gluons. Thus there is
a predicted tendency for the distribution function $q(x, \mu^2)$ to get smaller at
large $x$ as $\mu^2$ increases, and larger at small $x$ (due to the build-up of slower
partons), while maintaining the integral of the distribution over $x$ as a constant.
The effect is illustrated qualitatively in figure 15.14. In addition, the
radiated gluons produce more $q\bar{q}$ pairs at small $x$. Thus the nucleon may be
pictured as having more and more constituents, all contributing to its total
momentum, as its structure is probed on ever smaller distance (larger $\mu^2$)
scales.
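The qualitative trend just described can be illustrated by evaluating the right-hand side of (15.108) for a toy input. The sketch below is only indicative: the input shape `toy_q`, the value `alpha_s=0.2` and the midpoint integration are all assumptions, and the splitting function is taken as $P_{qq}(z) = \frac{4}{3}(1+z^2)/(1-z)_+ + 2\delta(1-z)$. After the change of variable $z = x/y$, the ‘+’ prescription on the restricted range $x < z < 1$ leaves a boundary term $2 q(x) \ln(1-x)$.

```python
import math


def toy_q(x):
    """Hypothetical valence-like input shape q(x) = (1 - x)^3 (an assumption)."""
    return (1.0 - x) ** 3


def dglap_rhs(q, x, alpha_s=0.2, n_steps=4000):
    """Right-hand side of (15.108) at momentum fraction x, using z = x/y."""
    h = (1.0 - x) / n_steps
    qx = q(x)
    integral = 0.0
    for i in range(n_steps):
        z = x + (i + 0.5) * h      # midpoints avoid the endpoint z = 1
        integral += ((1.0 + z * z) * q(x / z) / z - 2.0 * qx) / (1.0 - z)
    integral *= h
    # '+' prescription on x < z < 1 adds the boundary term 2 q(x) ln(1 - x).
    plus_term = (4.0 / 3.0) * (integral + 2.0 * qx * math.log(1.0 - x))
    delta_term = 2.0 * qx          # from the 2 delta(1 - z) piece of P_qq
    return alpha_s / (2.0 * math.pi) * (plus_term + delta_term)


for x in (0.05, 0.3, 0.8):
    print(f"x = {x:4.2f}   mu^2 dq/dmu^2 = {dglap_rhs(toy_q, x):+.5f}")
```

For this input the derivative comes out negative at $x = 0.8$ and positive at $x = 0.05$, in line with the depletion at large $x$ and build-up at small $x$ described above.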

In general, the right-hand side of (15.108) will have to be supplemented 
by terms (calculable from figure 15.10) in which quarks are generated from 
the gluon distribution; the equations must then be closed by a corresponding 
one describing the evolution of the gluon distributions (Altarelli 1982). In the 
now commonly used notation, this generalization of (15.108) reads 


$$\mu_F^2\, \frac{\partial f_{i/p}(x, \mu_F^2)}{\partial \mu_F^2} = \frac{\alpha_s(\mu_F^2)}{2\pi} \sum_j \int_x^1 \frac{dy}{y}\, P^{(1)}_{i \leftarrow j}(x/y)\, f_{j/p}(y, \mu_F^2) \qquad (15.109)$$

where the sum is over quark types $q$ and gluons $g$, $P^{(1)}_{i \leftarrow j}$ is the $j \to i$ splitting
function to this order, and $f_{i/p}$ is the parton distribution function for partons
of type $i$ in the proton. In our previous notation, $P^{(1)}_{q \leftarrow q}(x/y) = P_{qq}(x/y)$,
and $f_{q/p}(x, \mu_F^2) = q(x, \mu_F^2)$. The other splitting functions may be found in
Altarelli (1982).

FIGURE 15.14
Evolution of the distribution function with $\mu^2$. (Vertical axis: $x q^{NS}(x)$.)

Both the splitting functions and expression (15.106) for $F_2(x, Q^2)$ can be
extended to higher orders in $\alpha_s$. Thus the perturbative expansion (15.106)
becomes

$$F_2(x, Q^2) = x \sum_{n=0}^{\infty} \frac{\alpha_s^n(\mu_F^2)}{(2\pi)^n} \sum_{i=q,g} \int_x^1 \frac{dz}{z}\, C^{(n)}_{2,i}(z, Q^2, \mu_F^2)\, f_{i/p}(x/z, \mu_F^2), \qquad (15.110)$$

where we have chosen $\mu = \mu_F$. The expansion (15.110) is analogous to (15.63),
and as in that case the coefficient functions will depend on $\mu_F^2$ in such a way
that, order by order, the $\mu_F^2$ dependence will cancel. At zeroth order the
coefficients are the $\mu_F^2$-independent free parton ones, $C^{(0)}_{2,q} = e_q^2 \delta(1 - z)$ and
$C^{(0)}_{2,g} = 0$. In most cases the coefficients have been calculated up to order $\alpha_s^2$
(Nakamura et al. 2010).

We ought also to mention that there are in principle non-perturbative
corrections to both (15.63) and (15.110), which are of order $(\Lambda_{\overline{\rm MS}}^2/Q^2)^2$ and
$(\Lambda_{\overline{\rm MS}}^2/Q^2)$ respectively.

FIGURE 15.15

$Q^2$-dependence of the proton structure function $F_2^p$ for various fixed $x$ values
(Hagiwara et al. 2002). $i_x$ is a number depending on the $x$-bin, ranging from
$i_x = 1$ ($x = 0.85$) to $i_x = 28$ ($x = 0.000063$). Figure reprinted with permission
from K Hagiwara et al. Phys. Rev. D 66 010001 (2002). Copyright 2002 by
the American Physical Society.


15.6.3 Comparison with experiment 


Data on nucleon structure functions do indeed show the trend described in 
the previous section. Figure 15.15 shows the $Q^2$-dependence of the proton
structure function $F_2^p(x, Q^2) = \sum_i e_i^2\, x f_{i/p}(x, Q^2)$ for various fixed $x$ values, as
compiled by B. Foster, A.D. Martin and M.G. Vincter for the 2002 Particle
Data Group review (Hagiwara et al. 2002). Clearly at larger $x$ ($x > 0.13$) the
function gets smaller as $Q^2$ increases, while at smaller $x$ it increases.

Fits to the data have been made in various ways. One (theoretically convenient)
way is to consider ‘moments’ (Mellin transforms) of the structure
functions, defined by

$$M_q^n(t) = \int_0^1 dx\, x^{n-1}\, q(x, t), \qquad (15.111)$$


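where we have taken $\mu^2 = \mu_F^2$ and introduced the variable $t = \ln \mu^2$. Taking
moments of both sides of (15.108) and interchanging the order of the $x$ and $y$
integrations, we find

$$\frac{dM_q^n(t)}{dt} = \frac{\alpha_s(t)}{2\pi} \int_0^1 \frac{dy}{y}\, q(y, t) \int_0^y dx\, x^{n-1} P_{qq}(x/y). \qquad (15.112)$$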


Changing the variable to $z = x/y$ in the second integral, and defining*

$$\gamma_{qq}^n = 4 \int_0^1 dz\, z^{n-1} P_{qq}(z), \qquad (15.113)$$

we obtain

$$\frac{dM_q^n(t)}{dt} = \frac{\alpha_s(t)}{8\pi}\, \gamma_{qq}^n\, M_q^n(t). \qquad (15.114)$$

Thus the integral in (15.108) — which is of convolution type — has been reduced 
to product form by this transformation. Now we also know from (15.47) and
(15.48) that

$$\frac{d\alpha_s}{dt} = -\beta_0 \alpha_s^2 \qquad (15.115)$$

with $\beta_0 = (33 - 2N_f)/12\pi$ as usual, to this (one-loop) order. Thus (15.114)
becomes

$$\frac{d \ln M_q^n}{d \ln \alpha_s} = -\frac{\gamma_{qq}^n}{8\pi\beta_0} \equiv d_{qq}^n. \qquad (15.116)$$

*The notation is not chosen accidentally: the $\gamma$’s are indeed anomalous dimensions of
certain operators which appear in Wilson's operator product approach to scaling violations
(Wilson 1969); interested readers may pursue this with Peskin and Schroeder 1995, chapter
18.

FIGURE 15.16

Distributions of $x$ times the unpolarized parton distributions $f(x, \mu^2)$ (where
$f = u_v, d_v, \bar{u}, \bar{d}, s, c, g$) using the MRST2001 parametrization (Martin et al.
2002) at a scale $\mu^2 = 10\ \text{GeV}^2$. Figure reprinted with permission from K
Hagiwara et al. Phys. Rev. D 66 010001 (2002). Copyright 2002 by the
American Physical Society.

The solution to (15.116) is easily found to be

$$M_q^n(t) = M_q^n(t_0) \left( \frac{\alpha_s(t)}{\alpha_s(t_0)} \right)^{d_{qq}^n}. \qquad (15.117)$$

Applying the prescription (15.99) to $\gamma_{qq}^n$, we find (problem 15.9)

$$\gamma_{qq}^n = -\frac{8}{3} \left[ 1 - \frac{2}{n(n+1)} + 4 \sum_{j=2}^{n} \frac{1}{j} \right] \qquad (15.118)$$

and then

$$d_{qq}^n = \frac{4}{33 - 2N_f} \left[ 1 - \frac{2}{n(n+1)} + 4 \sum_{j=2}^{n} \frac{1}{j} \right]. \qquad (15.119)$$
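The chain (15.113)-(15.119) is easy to explore numerically. The following sketch compares the closed form for $\gamma_{qq}^n$ with a direct integration of (15.113), and then evaluates the moment ratio implied by (15.117). It assumes $P_{qq}(z) = \frac{4}{3}(1+z^2)/(1-z)_+ + 2\delta(1-z)$, the one-loop coupling $\alpha_s = 1/[\beta_0 \ln(Q^2/\Lambda^2)]$, and an illustrative value $\Lambda = 0.2$ GeV; the function names and that numerical value are not taken from the text.

```python
import math


def gamma_qq(n):
    """Closed form (15.118) for the anomalous dimension gamma_qq^n."""
    harmonic_tail = sum(1.0 / j for j in range(2, n + 1))
    return -(8.0 / 3.0) * (1.0 - 2.0 / (n * (n + 1)) + 4.0 * harmonic_tail)


def gamma_qq_numeric(n, n_steps=20000):
    """4 * int_0^1 dz z^(n-1) P_qq(z), eq. (15.113), by the midpoint rule."""
    h = 1.0 / n_steps
    total = 0.0
    for i in range(n_steps):
        z = (i + 0.5) * h
        # '+' prescription: subtract the value of (1 + z^2) z^(n-1) at z = 1.
        total += ((1.0 + z * z) * z ** (n - 1) - 2.0) / (1.0 - z)
    return 4.0 * ((4.0 / 3.0) * total * h + 2.0)


def d_qq(n, n_f):
    """Exponent (15.119): d_qq^n = -gamma_qq^n / (8 pi beta_0)."""
    beta_0 = (33.0 - 2.0 * n_f) / (12.0 * math.pi)
    return -gamma_qq(n) / (8.0 * math.pi * beta_0)


def alpha_s_one_loop(q2, n_f, lam=0.2):
    """One-loop coupling; lam (GeV) is an assumed illustrative value."""
    beta_0 = (33.0 - 2.0 * n_f) / (12.0 * math.pi)
    return 1.0 / (beta_0 * math.log(q2 / lam ** 2))


def moment_ratio(n, q2, q2_0, n_f=4):
    """M_q^n(t) / M_q^n(t_0) from (15.117), with Q^2 values in GeV^2."""
    ratio = alpha_s_one_loop(q2, n_f) / alpha_s_one_loop(q2_0, n_f)
    return ratio ** d_qq(n, n_f)


for n in (1, 2, 4, 8):
    print(n, round(gamma_qq(n), 4), round(d_qq(n, 4), 4),
          round(moment_ratio(n, 100.0, 10.0), 4))
```

The $n = 1$ moment has $\gamma_{qq}^1 = 0$ and so does not evolve, which is the statement of net quark number conservation; higher moments fall as the scale increases.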


We emphasize again that all the foregoing analysis is directly relevant 
only to distributions in which the flavour singlet gluon distributions do not 
contribute to the evolution equations. In the more general case, analogous 
splitting functions $P_{qg}$, $P_{gq}$ and $P_{gg}$ will enter, folded appropriately with the
gluon distribution function $g(x, t)$, together with the related quantities $\gamma_{qg}^n$, $\gamma_{gq}^n$
and $\gamma_{gg}^n$. Equation (15.108) is then replaced by a $2 \times 2$ matrix equation for
the evolution of the quark and gluon moments $M_q^n$ and $M_g^n$.

Returning to (15.117), one way of testing it is to plot the logarithm of one 
moment, $\ln M_q^n$, versus the logarithm of another, $\ln M_q^m$, for different $n, m$
values. A more direct procedure, applicable to the non-singlet case too of
course, is to choose a reference point $\mu_0^2$ and parametrize the parton distribution
functions $f_i(x, t_0)$ in some way. These may then be evolved numerically,
via the DGLAP equations, to the desired scale. Figure 15.16 shows a typical
set of distributions at $\mu^2 = 10\ \text{GeV}^2$ (Martin et al. 2002). A global numerical
fit is then performed to determine the best values of the parameters, including
the parameter $\Lambda_{\overline{\rm MS}}$ which enters into $\alpha_s(t)$. An example of such a fit, due to
Martin et al. (1994), is shown in figure 15.17.

FIGURE 15.17

Data on the structure function $F_2$ in muon-proton deep inelastic scattering,
from BCDMS (Benvenuti et al. 1989) and NMC (Amaudruz et al. 1992). 
The curves are QCD fits (Martin et al. 1994) as described in the text. Figure 
reprinted with permission from A D Martin et al. Phys. Rev. D 50 6734 
(1994). Copyright 1994 by the American Physical Society. 


It may be worth pausing to reflect on how far our understanding of struc- 
ture has developed, via quantum field theory, from the simple ‘fixed number 
of constituents’ models which are useful in atomic and nuclear physics. When
nucleons are probed on finer and finer scales, more and more partons (gluons,
$q\bar{q}$ pairs) appear, in a way quantitatively predicted by QCD. The precise
experimental confirmation of these predictions (and many others, as discussed
by Ellis, Stirling and Webber 1996, for example) constitutes a remarkable vote 
of confidence, by Nature, in relativistic quantum field theory.
Problems 
15.1 Verify equation (15.10). 
15.2 Verify equation (15.27). 
15.3 Check that (15.50) can be rewritten as (15.53). 
15.4 (a) Verify (15.61). (b) Show that the next term in the expansion (15.60) 
is 

(b2 — bi) M 

Bo 


where bo = (2/0. By iteratively solving the resulting modified equation 
(15.60), show that the corresponding correction to (15.61) is 


$$\frac{1}{\beta_0^3 L^3} \left[ b_1^2 (\ln^2 L - \ln L - 1) + b_2 \right].$$


15.5 Verify that for the type of behaviour of the $\beta$ function shown in figure
15.7(b), $\alpha_s^*$ is reached as $q^2 \to 0$.
15.6 Verify equation (15.102). 


15.7 Check that the electromagnetic charge $e$ has dimension (mass)$^{\epsilon/2}$ in
$d = 4 - \epsilon$ dimensions.


15.8 Verify equation (O.20) in appendix O. 
15.9 Verify equation (15.118). 




16 


Lattice Field Theory, and the 
Renormalization Group Revisited 


16.1 Introduction 


Throughout this book, thus far, we have relied on perturbation theory as the 
calculational tool, justifying its use in the case of QCD by the smallness of the 
coupling constant at short distances; note, however, that this result itself re- 
quired the summation of an infinite series of perturbative terms. As remarked 
in section 15.3, the concomitant of asymptotic freedom is that $\alpha_s$ really does
become strong at small $Q^2$, or at long distances of order $\Lambda_{\overline{\rm MS}}^{-1} \sim 1$ fm. Here we
have no prospect of getting useful results from perturbation theory: it is the 
non-perturbative regime. But this is precisely the regime in which quarks bind 
together to form hadrons. If QCD is indeed the true theory of the interaction 
between quarks, then it should be able to explain, ultimately, the vast amount 
of data that exists in low energy hadronic physics. For example: what are 
the masses of mesons and baryons? Are there novel colourless states such as 
glueballs? Is SU(2)$_f$ or SU(3)$_f$ chiral symmetry spontaneously broken? What
is the form of the effective interquark potential? What are the hadronic form 
factors, in electromagnetic (chapter 9) or weak (chapter 20) processes? 

After more than 30 years of theoretical development, and machine ad- 
vances, numerical simulations of lattice QCD are now yielding precise answers 
to many of these questions, thereby helping to establish QCD as the correct 
theory of the strong interactions of quarks, and also providing reliable input 
needed for the discovery of new physics. Lattice QCD is a highly mature 
field, and many technical details are beyond our scope. Rather, in this chap- 
ter we aim to give an elementary introduction to lattice field theory in general, 
including some important insights that it generates concerning the renormal- 
ization group. We return to QCD in the final section, with some illustrative 
results. 

In thinking about how to formulate a non-perturbative approach to quan- 
tum field theory, several questions immediately arise. First of all, how can we 
regulate the ultraviolet divergences, and thus define the theory, if we cannot 
get to grips with them via the specific divergent integrals supplied by per- 
turbation theory? We need to be able to regulate the divergences in a way 
which does not rely on their appearance in the Feynman graphs of perturbation
theory. As Wilson (1974, 1975) was the first to propose, one quite
natural non-perturbative way of regulating ultraviolet divergences is to approximate
continuous space-time by a discrete lattice of points. Such a lattice
will introduce a minimum distance, namely the lattice spacing ‘$a$’ between
neighbouring points. Since no two points can ever be closer than $a$, there is
now a corresponding maximum momentum $\Lambda = \pi/a$ (see following equation
(16.6)) in the lattice version of the theory. Thus the theory is automatically 
ultraviolet finite from the start, without presupposing the existence of any 
perturbative expansion; renormalization questions will, however, enter when 
we consider the a dependence of our parameters. As long as the lattice spac- 
ing is much smaller than the physical size of the hadrons one is studying, 
the lattice version of the theory should be a good approximation. Of course, 
Lorentz invariance is sacrificed in such an approach, and replaced by some 
form of hypercubic symmetry; we must hope that for small enough a this will 
not matter. We shall discuss how simple field theories are ‘discretized’ in the 
next section; scalar fields, fermion fields, and gauge fields each require their 
own prescriptions. 

Next, we must ask how a discretized quantum field theory can be formu- 
lated in a way suitable for numerical computation. Any formalism based on 
non-commuting operators seems to be ruled out, since it is hard to see how 
they could be numerically simulated. Indeed, the same would be true of ordi- 
nary quantum mechanics. Fortunately a formulation does exist which avoids 
operators: Feynman’s sum over paths approach, which was briefly mentioned 
in section 5.2.2. This method is the essential starting point for the lattice ap- 
proach to quantum field theory, and it will be introduced in section 16.3. The 
sum over paths approach does not involve quantum operators, but fermions 
still have to be accommodated somehow. The way this is done is briefly 
described in section 16.3: see also appendix P. 

It turns out that this formulation enables direct contact to be made be- 
tween quantum field theory and statistical mechanics, as we shall discuss in 
section 16.3.3. This relationship has proved to be extremely fruitful, allowing 
physical insights and numerical techniques to pass from one subject to the 
other, in a way that has been very beneficial to both. In section 16.4 we make 
a worthwhile detour to explore the physics of renormalization and of the RGE 
from a lattice/statistical mechanics perspective, before returning to QCD in 
section 16.5. 


16.2 Discretization 


16.2.1 Scalar fields 


We start by considering a simple field theory involving a scalar field $\phi$.
Postponing until section 16.3 the question of exactly how we shall use it, we assume
that we shall still want to formulate the theory in terms of an action of the
form

$$S = \int d^4x\, \mathcal{L}(\phi, \nabla\phi, \dot{\phi}). \qquad (16.1)$$


It seems plausible that it might be advantageous to treat space and time 
as symmetrically as possible, from the start, by formulating the theory in 
‘Euclidean’ space, instead of Minkowskian, by introducing $t = -i\tau$; further
motivation for doing this will be provided in section 16.3. In that case, the 
action (16.1) becomes 


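$$S \to -i \int d^3x\, d\tau\, \mathcal{L}\left(\phi, \nabla\phi, i\frac{\partial\phi}{\partial\tau}\right) \qquad (16.2)$$
$$\equiv i \int d^3x\, d\tau\, \mathcal{L}_E \equiv i S_E. \qquad (16.3)$$

A typical free scalar action is then

$$S_E(\phi) = \frac{1}{2} \int d^3x\, d\tau \left[ (\partial_\tau \phi)^2 + (\nabla\phi)^2 + m^2 \phi^2 \right]. \qquad (16.4)$$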
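The step from (16.1) to (16.4) can be spelled out for the free field. The short derivation below assumes the standard Minkowski Lagrangian $\mathcal{L} = \frac{1}{2}\dot{\phi}^2 - \frac{1}{2}(\nabla\phi)^2 - \frac{1}{2}m^2\phi^2$, which is not written out in this section, and is meant only as a sketch of the sign bookkeeping.

```latex
\begin{align*}
t = -i\tau \;&\Rightarrow\; dt = -i\, d\tau, \qquad
  \frac{\partial}{\partial t} = i\, \frac{\partial}{\partial \tau}, \\
\mathcal{L} \;&\to\; \tfrac{1}{2}(i\,\partial_\tau\phi)^2
  - \tfrac{1}{2}(\nabla\phi)^2 - \tfrac{1}{2} m^2 \phi^2
  = -\tfrac{1}{2}\left[ (\partial_\tau\phi)^2 + (\nabla\phi)^2 + m^2\phi^2 \right]
  \equiv -\mathcal{L}_E, \\
S \;&\to\; -i \int d^3x\, d\tau\, (-\mathcal{L}_E) = i S_E,
  \qquad e^{iS} \to e^{-S_E}.
\end{align*}
```

The last relation shows why the Euclidean form is convenient: the oscillating phase becomes a real, damped weight.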


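We now represent all of space-time by a finite-volume ‘hypercube’. For
example, we may have $N_1$ lattice points along the $x$-axis, so that a field $\phi(x)$
is replaced by the $N_1$ numbers $\phi(n_1 a)$ with $n_1 = 0, 1, \ldots, N_1 - 1$. We write
$L = N_1 a$ for the length of the cube side. In this notation, integrals and
differentials are replaced by the finite sums and difference expressions

$$\int dx \to a \sum_{n_1}, \qquad \frac{\partial\phi}{\partial x} \to \frac{1}{a}\left[ \phi(n_1 + 1) - \phi(n_1) \right], \qquad (16.5)$$

so that a typical integral (in one dimension) becomes

$$\int dx \left( \frac{\partial\phi}{\partial x} \right)^2 \to a \sum_{n_1} \frac{1}{a^2} \left[ \phi(n_1 + 1) - \phi(n_1) \right]^2. \qquad (16.6)$$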


As in all our previous work, we can alternatively consider a formulation in
momentum space, which will also be discretized. It is convenient to impose
periodic boundary conditions such that $\phi(x) = \phi(x + L)$. Then the allowed
$k$-values may be taken to be $k_{\nu_1} = 2\pi\nu_1/L$ with $\nu_1 = -N_1/2 + 1, \ldots, 0, \ldots, N_1/2$
(we take $N_1$ to be even). It follows that the maximum allowed magnitude of
the momentum is then $\pi/a$, indicating that $a^{-1}$ is (as anticipated) playing
the role of our earlier momentum cut-off $\Lambda$. We then write

$$\phi(n_1) = \frac{1}{(N_1 a)^{1/2}} \sum_{\nu_1} e^{i 2\pi \nu_1 n_1/N_1}\, \tilde{\phi}(k_{\nu_1}), \qquad (16.7)$$

which has the inverse

$$\tilde{\phi}(k_{\nu_1}) = \left( \frac{a}{N_1} \right)^{1/2} \sum_{n_1} e^{-i 2\pi \nu_1 n_1/N_1}\, \phi(n_1), \qquad (16.8)$$




since (problem 16.1)

$$\frac{1}{N_1} \sum_{n_1=0}^{N_1-1} e^{i 2\pi n_1 (\nu_1 - \nu_2)/N_1} = \delta_{\nu_1, \nu_2}. \qquad (16.9)$$


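Equation (16.9) is a discrete version of the $\delta$-function relation given in (E.25) of
volume 1. A one-dimensional version of the mass term in (16.4) then becomes
(problem 16.2)

$$\frac{1}{2} \int dx\, m^2 \phi(x)^2 \to \frac{1}{2} m^2 \sum_{k_{\nu_1}} \tilde{\phi}(k_{\nu_1})\, \tilde{\phi}(-k_{\nu_1}), \qquad (16.10)$$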


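while

$$\frac{1}{2} \int dx \left( \frac{\partial\phi}{\partial x} \right)^2 \to \frac{1}{2a} \sum_{n_1} \left[ \phi(n_1 + 1) - \phi(n_1) \right]^2 \qquad (16.11)$$
$$= \frac{2}{a^2} \sum_{k_{\nu_1}} \tilde{\phi}(k_{\nu_1}) \sin^2\!\left( \frac{k_{\nu_1} a}{2} \right) \tilde{\phi}(-k_{\nu_1}). \qquad (16.12)$$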

Thus a one-dimensional version of the free action (16.4) is 
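$$\frac{1}{2} \sum_{k_{\nu_1}} \tilde{\phi}(k_{\nu_1}) \left[ \frac{4 \sin^2(k_{\nu_1} a/2)}{a^2} + m^2 \right] \tilde{\phi}(-k_{\nu_1}). \qquad (16.13)$$

In the continuum case, (16.13) would be replaced by

$$\frac{1}{2} \int \frac{dk}{2\pi}\, \tilde{\phi}(k) \left[ k^2 + m^2 \right] \tilde{\phi}(-k) \qquad (16.14)$$

as usual, which implies that the propagator in the discrete case is proportional
to

$$\left[ \frac{4 \sin^2(k_{\nu_1} a/2)}{a^2} + m^2 \right]^{-1} \qquad (16.15)$$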

rather than to $[k^2 + m^2]^{-1}$ (remember we are in one-dimensional Euclidean
space). The two expressions do coincide in the continuum limit $a \to 0$. The
manipulations we have been going through will be easily recognized by readers
familiar with the theory of lattice vibrations and phonons, and lead to a
satisfactory discretization of scalar fields. For Dirac fields the matter is not
so straightforward.
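The equality between the position-space and momentum-space forms of the one-dimensional free action can be checked numerically. The sketch below is a minimal illustration for a real field, assuming periodic boundary conditions and the normalization $\tilde{\phi} = (a/N_1)^{1/2} \sum_{n_1} e^{-i2\pi\nu_1 n_1/N_1} \phi(n_1)$ of (16.8); the lattice size, spacing, mass and random field values are arbitrary choices made for the example.

```python
import cmath
import math
import random


def lattice_k2(k, a):
    """Lattice replacement for k^2: 4 sin^2(k a / 2) / a^2, as in (16.13)."""
    return 4.0 * math.sin(k * a / 2.0) ** 2 / a ** 2


def fourier_modes(phi, a):
    """Eq. (16.8). nu runs over 0..N-1 here, which is equivalent by
    periodicity to the symmetric range -N/2+1..N/2 used in the text."""
    n_sites = len(phi)
    norm = math.sqrt(a / n_sites)
    return [norm * sum(cmath.exp(-2j * math.pi * nu * n / n_sites) * phi[n]
                       for n in range(n_sites))
            for nu in range(n_sites)]


def action_position_space(phi, a, m):
    """a * sum_n { (1/2)[(phi(n+1) - phi(n))/a]^2 + (1/2) m^2 phi(n)^2 }."""
    n_sites = len(phi)
    total = 0.0
    for n in range(n_sites):
        diff = (phi[(n + 1) % n_sites] - phi[n]) / a   # periodic boundary
        total += 0.5 * diff ** 2 + 0.5 * m ** 2 * phi[n] ** 2
    return a * total


def action_momentum_space(phi, a, m):
    """Eq. (16.13); for real phi, phi~(-k) is the complex conjugate of phi~(k)."""
    n_sites = len(phi)
    total = 0.0
    for nu, mode in enumerate(fourier_modes(phi, a)):
        k = 2.0 * math.pi * nu / (n_sites * a)
        total += 0.5 * abs(mode) ** 2 * (lattice_k2(k, a) + m ** 2)
    return total


random.seed(1)
field = [random.uniform(-1.0, 1.0) for _ in range(16)]
s_x = action_position_space(field, a=0.5, m=1.0)
s_k = action_momentum_space(field, a=0.5, m=1.0)
print(s_x, s_k)
```

The two numbers agree to rounding error, and `lattice_k2(k, a)` tends to $k^2$ as `a` is reduced at fixed `k`, which is the continuum limit mentioned above.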


16.2.2 Dirac fields 


The first obvious problem has already been mentioned: how are we to represent
such entirely non-classical objects, which obey anticommutation relations?
This is part of the wider problem of representing field operators in
a form suitable for numerical simulation, which we defer until section 16.3.
There is, however, a quite separate problem which arises when we try to repeat 
for the Dirac field the discretization used for the scalar field. 

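First note that the Euclidean Dirac matrices $\gamma_\mu^E$ are related to the usual
Minkowski ones $\gamma_M^\mu$ by $\gamma_{1,2,3}^E \equiv -i\gamma_M^{1,2,3}$, $\gamma_4^E \equiv -i\gamma_M^4 \equiv \gamma_M^0$. They satisfy
$\{\gamma_\mu^E, \gamma_\nu^E\} = 2\delta_{\mu\nu}$ for $\mu = 1, 2, 3, 4$. The Euclidean Dirac Lagrangian is then
$\bar{\psi}(x)\left[ \gamma_\mu^E \partial_\mu + m \right]\psi(x)$, which should be written now in Hermitean form

$$m\bar{\psi}(x)\psi(x) + \frac{1}{2}\left\{ \bar{\psi}(x)\gamma_\mu^E \partial_\mu \psi(x) - (\partial_\mu \bar{\psi}(x))\gamma_\mu^E \psi(x) \right\}. \qquad (16.16)$$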


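The corresponding ‘one-dimensional’ discretized action is then

$$\sum_{n_1} a \left\{ m\bar{\psi}(n_1)\psi(n_1) + \frac{1}{2}\bar{\psi}(n_1)\gamma_1^E \left[ \frac{\psi(n_1+1) - \psi(n_1)}{a} \right] - \frac{1}{2}\left[ \frac{\bar{\psi}(n_1+1) - \bar{\psi}(n_1)}{a} \right]\gamma_1^E \psi(n_1) \right\} \qquad (16.17)$$
$$= a \sum_{n_1} \left\{ m\bar{\psi}(n_1)\psi(n_1) + \frac{1}{2a}\left[ \bar{\psi}(n_1)\gamma_1^E \psi(n_1+1) - \bar{\psi}(n_1+1)\gamma_1^E \psi(n_1) \right] \right\}. \qquad (16.18)$$

In momentum space this becomes (problem 16.3)

$$\sum_{k_{\nu_1}} \bar{\tilde{\psi}}(k_{\nu_1}) \left[ i\gamma_1^E\, \frac{\sin(k_{\nu_1} a)}{a} + m \right] \tilde{\psi}(k_{\nu_1}), \qquad (16.19)$$

and the inverse propagator is $\left[ i\gamma_1^E \sin(k_{\nu_1} a)/a + m \right]$. Thus the propagator itself
is

$$\left[ m - i\gamma_1^E\, \frac{\sin(k_{\nu_1} a)}{a} \right] \Big/ \left[ m^2 + \frac{\sin^2(k_{\nu_1} a)}{a^2} \right]. \qquad (16.20)$$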


But here there is a problem: in addition to the correct continuum limit ($a \to 0$)
found at $k_{\nu_1} \to 0$, an alternative finite $a \to 0$ limit is found at $k_{\nu_1} \to \pi/a$
(consider expanding $a^{-1} \sin[(\pi/a - \delta)a]$ for small $\delta$). Thus two modes survive
as $a \to 0$, a phenomenon known as the ‘fermion doubling problem’. Actually
in four dimensions there are sixteen such corners of the hypercube, so we have
far too many degenerate lattice copies (which are called different ‘tastes’, to
distinguish them from the real quark flavours).
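The doubling can be seen directly from the momentum-dependent part of the naive inverse propagator. In this small sketch the function name and the values of `a` and `delta` are illustrative assumptions; it evaluates $\sin(ka)/a$ a distance $\delta$ away from $k = 0$ and from $k = \pi/a$, and counts the corners of the four-dimensional Brillouin zone.

```python
import math
from itertools import product


def naive_kinetic(k, a):
    """Coefficient of i*gamma_1 in the naive inverse propagator: sin(k a)/a."""
    return math.sin(k * a) / a


a = 0.1
delta = 0.01
near_zero = naive_kinetic(delta, a)                  # physical mode, k ~ 0
near_edge = naive_kinetic(math.pi / a - delta, a)    # doubler, k ~ pi/a
print(near_zero, near_edge)   # both are close to delta

# In four dimensions each momentum component can sit at 0 or at pi/a.
corners = list(product((0.0, math.pi / a), repeat=4))
print(len(corners))
```

Both evaluations behave like a continuum momentum of size $\delta$, so each zero of the sine supplies a low-energy mode; with two choices per direction, four dimensions give $2^4 = 16$ of them.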

Various solutions to this problem have been proposed. Wilson (1975), for
example, suggested adding the extra term

$$-\frac{1}{2}\, r \sum_{n_1} \bar{\psi}(n_1) \left[ \psi(n_1 + 1) + \psi(n_1 - 1) - 2\psi(n_1) \right] \qquad (16.21)$$

to the fermion action in this one-dimensional case, where $r$ is dimensionless.
Evidently this is a second difference, and it would correspond to the term

$$-\frac{1}{2}\, r a \int d^3x\, d\tau\, \bar{\psi}(x) (\partial_\tau^2 + \nabla^2) \psi(x) \qquad (16.22)$$

in the four-dimensional continuum action. Note the presence of the lattice
spacing ‘$a$’ in (16.22), which ensures its disappearance as $a \to 0$. The
higher-derivative term $\bar{\psi}(\partial_\tau^2 + \nabla^2)\psi$ has mass dimension 5, and therefore requires a
coupling constant with mass dimension $-1$, i.e. a length in units $\hbar = c = 1$;
it is, in fact, a non-renormalizable term. However, if we recall the discussion
of section 11.8 in volume 1, we would expect it to be suppressed at low
momenta much less than the cut-off $\pi/a$. Hence it is natural to see a coupling
proportional to $a$ appearing in (16.22). (We shall see in section 16.5.3
how renormalization group ideas provide a different perspective on such
non-renormalizable interactions, classifying them as ‘irrelevant’.)

How does the extra term (16.21) help the doubling problem? One easily 
finds that it changes the (one-dimensional) inverse propagator to 


$$i\gamma_1^E\, \frac{\sin(k_{\nu_1} a)}{a} + m + \frac{r}{a}\left( 1 - \cos(k_{\nu_1} a) \right). \qquad (16.23)$$


By considering the expansion of the cosine near $k_{\nu_1} \approx 0$ it can be seen that
the second term disappears in the continuum limit, as expected. However,
for $k_{\nu_1} \approx \pi/a$ it gives a large term of order $1/a$ which adds to the mass $m$,
effectively banishing the ‘doubled’ state to a very high mass, far from the
physical spectrum.
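A few numbers make the effect of the Wilson term concrete. The sketch below evaluates the part of the modified inverse propagator that is proportional to the unit matrix, $m + (r/a)(1 - \cos ka)$, at $k = 0$ and at $k = \pi/a$. The bare mass `m = 0.05`, the choice `r = 1` and the sample lattice spacings are hypothetical values chosen only for illustration.

```python
import math


def wilson_mass_term(k, a, m, r=1.0):
    """Unit-matrix part of (16.23): m + (r/a)(1 - cos(k a))."""
    return m + (r / a) * (1.0 - math.cos(k * a))


m = 0.05   # hypothetical bare mass, in the same units as 1/a
for a in (0.5, 0.1, 0.02):
    physical = wilson_mass_term(0.0, a, m)
    doubler = wilson_mass_term(math.pi / a, a, m)
    print(f"a = {a:5.2f}   k = 0: {physical:.3f}   k = pi/a: {doubler:.3f}")
```

The mode at $k = 0$ keeps mass $m$, while the mode at $k = \pi/a$ acquires $m + 2r/a$, which grows without bound as $a \to 0$.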

Unfortunately there is a price to pay. The problem is that, as we learned in 
section 12.3.2, the QCD lagrangian has an exact chiral symmetry for massless 
quarks. To the extent that $m_u$ and $m_d$ (and $m_s$, but less so) are small on
a hadronic scale such as $\Lambda_{\overline{\rm MS}}$, we expect chiral symmetry to have important
physical consequences. These will indeed be explored in chapter 18. For the 
moment, we note merely that it is important for lattice-based QCD calcu- 
lations to be able to deal correctly with the light quarks. Now we cannot 
simply choose the bare Lagrangian mass parameters to be small, and leave it 
at that. In any interacting theory, renormalization effects will cause shifts in 
these masses. In a chirally symmetric theory, or one which is chirally sym- 
metric as a fermion mass goes to zero, such a mass shift is proportional to the 
fermion mass itself; in particular it does not simply add to the mass. We drew 
attention to this fact in the case of the electron mass renormalization in QED, 
in section 11.2. So in chirally symmetric theories, mass renormalizations are 
‘protected’, in this sense. But the modification (16.21), while avoiding phys- 
ical fermion doublers, breaks chiral symmetry badly. This can easily be seen 
by noting (see (12.154) for example) that the crucial property required for 
chiral symmetry to hold is 


$$\gamma_5 \slashed{D} + \slashed{D} \gamma_5 = 0, \qquad (16.24)$$
where $\slashed{D}$ is the SU(3)$_c$-covariant Dirac derivative. Any addition to $\slashed{D}$ which
is proportional to the unit $4 \times 4$ matrix will violate (16.24), and hence break
chiral symmetry. The Lagrangian mass m itself is of this form, and it breaks 
chiral symmetry, but ‘softly’ — i.e. in a way that disappears as m goes to zero 
(thereby preserving the symmetry in this limit). The Wilson addition (16.21) 
also breaks chiral symmetry, but it remains there even as m — 0: it is a ‘hard’ 
breaking. 

This means that in the theory with the Wilson modification (i.e. with 
‘Wilson fermions’) fermion mass renormalization will not be protected by the 
chiral symmetry, so that large additive renormalizations are possible. This 
will require repeated fine-tunings of the bare mass parameters, to bring them 
down to the desired small values. And it turns out that this seriously lengthens 
the computing time. 

Another approach (‘staggered fermions’) was suggested by Kogut and 
Susskind (1975), Banks et al. (1976), and Susskind (1977). This essentially 
involves distributing the 4 spin degrees of freedom of the Dirac field across 
different lattice sites (we shall not need the details). At each site there is now 
a one-component fermion, with the colour degrees of freedom, which speeds 
the calculations. The 16-fold ‘doubling’ degeneracy can be re-arranged as 
four different tastes of 4-component fermions, while retaining enough chiral 
symmetry to forbid additive mass renormalizations. 

Since the different components of the staggered Dirac field now live on 
different sites, they will experience slightly different gauge field interactions. 
(These are of course local in the continuum limit, but the point remains true 
after discretization, as we shall see in the following section.) These interactions 
will mix fields of different tastes, causing new problems, but they can be 
suppressed by adding further terms to the action. There is still the 4-fold 
degeneracy to get rid of, but a trick is available for that, as we shall explain 
in section 16.3. 

One might wonder if a lattice theory with fermions could be formulated 
such that it both avoids doublers and preserves chiral symmetry. For quite 
a long time it was believed that this was not possible — a conclusion which 
was essentially the content of the Nielsen-Ninomiya theorem (Nielsen and
Ninomiya 1981a, b, c). But more recently a way was found to formulate chiral
gauge theories with fermions satisfactorily on the lattice at finite spacing $a$.
The key is to replace the condition (16.24) by the Ginsparg-Wilson (1982)
relation

$$\gamma_5 \slashed{D} + \slashed{D} \gamma_5 = a\, \slashed{D} \gamma_5 \slashed{D}. \qquad (16.25)$$


This relation implies (Lüscher 1998) that the associated action has an exact 
symmetry, with infinitesimal variations proportional to 


$$\delta\psi = \gamma_5 \left( 1 - \tfrac{1}{2} a \slashed{D} \right) \psi \qquad (16.26)$$
$$\delta\bar{\psi} = \bar{\psi} \left( 1 - \tfrac{1}{2} a \slashed{D} \right) \gamma_5. \qquad (16.27)$$

The symmetry under (16.26)-(16.27), which is proportional to the infinitesimal
version of (12.152) as $a \to 0$, provides a lattice theory with all the fundamental
symmetry properties of continuum chiral gauge theories (Hasenfratz
et al. 1998). Finding an operator which satisfies (16.25) is, however, not so 
easy — but that problem has now been solved, indeed in three different ways: 
Kaplan's ‘domain wall’ fermions (Kaplan 1992); ‘classically perfect fermions’ 
(Hasenfratz and Niedermayer 1994); and overlap fermions (Narayanan and 
Neuberger 1993a, b, 1994, 1995). Unfortunately all these proposals are com- 
putationally more expensive than the Wilson or staggered fermion alterna- 
tives. 
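As a brief check that the Ginsparg-Wilson relation (16.25) does guarantee invariance under (16.26)-(16.27), one can vary a fermion action to first order. The sketch assumes an action of the form $\bar{\psi} \slashed{D} \psi$, with site sums and indices suppressed; that form is an assumption made here for illustration.

```latex
\begin{align*}
\delta(\bar{\psi} \slashed{D} \psi)
  &= \delta\bar{\psi}\, \slashed{D} \psi + \bar{\psi} \slashed{D}\, \delta\psi \\
  &= \bar{\psi} \left[ \left(1 - \tfrac{1}{2} a \slashed{D}\right) \gamma_5 \slashed{D}
     + \slashed{D} \gamma_5 \left(1 - \tfrac{1}{2} a \slashed{D}\right) \right] \psi \\
  &= \bar{\psi} \left[ \gamma_5 \slashed{D} + \slashed{D} \gamma_5
     - a\, \slashed{D} \gamma_5 \slashed{D} \right] \psi = 0 .
\end{align*}
```

The bracket in the last line vanishes by (16.25), and for $a \to 0$ the condition reduces to (16.24).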


16.2.3 Gauge fields 


Having explored the discretization of actions for free scalars and Dirac fermions,
we must now think about how to implement gauge invariance on the lattice. In
the usual (continuum) case, we saw in chapter 13 how this was implemented by
replacing ordinary derivatives by covariant derivatives, the geometrical
significance of which (in terms of parallel transport) is discussed in appendix N. It
is very instructive to see how the same ideas arise naturally in the lattice case.
We illustrate the idea in the simple case of the Abelian U(1) theory, QED.
Consider, for example, a charged scalar field $\phi(x)$, with charge $e$. To construct
a gauge-invariant current, for example, we replaced $\phi^\dagger \partial_\mu \phi$ by $\phi^\dagger (\partial_\mu - ieA_\mu)\phi$,
so we ask: what is the discrete analogue of this? The term $\phi^\dagger(x) \frac{\partial}{\partial x} \phi(x)$
becomes, as we have seen,

$$\phi^\dagger(n_1)\, \frac{1}{a} \left[ \phi(n_1 + 1) - \phi(n_1) \right] \qquad (16.28)$$


in one dimension. We do not expect (16.28) by itself to be gauge invariant,
and it is easy to check that it is not. Under a gauge transformation for the
continuous case, we have

$$\phi(x) \to e^{ie\theta(x)} \phi(x), \qquad A(x) \to A(x) + \frac{d\theta(x)}{dx}; \qquad (16.29)$$

then $\phi^\dagger(x)\phi(y)$ transforms by

$$\phi^\dagger(x) \phi(y) \to e^{-ie[\theta(x) - \theta(y)]}\, \phi^\dagger(x) \phi(y), \qquad (16.30)$$


and is clearly not invariant. The essential reason is that this operator involves
the fields at two different points, and so the term $\phi^\dagger(n_1)\phi(n_1 + 1)$ in (16.28)
will not be gauge invariant either. The discussion in appendix N prepares us
for this: we are trying to compare two ‘vectors’ (here, fields) at two different
points, when the ‘coordinate axes’ are changing as we move about. We need
to parallel transport one field to the same point as the other, before they can
be properly compared. The solution (N.18) shows us how to do this. Consider
the quantity

$$O(x, y) = \phi^\dagger(x) \exp\left[ ie \int_y^x A\, dx' \right] \phi(y). \qquad (16.31)$$

Under the gauge transformation (16.29), $O(x, y)$ transforms by

$$O(x, y) \to \phi^\dagger(x) e^{-ie\theta(x)}\, e^{ie \int_y^x A\, dx' + ie[\theta(x) - \theta(y)]}\, e^{ie\theta(y)} \phi(y) = O(x, y), \qquad (16.32)$$

and it is therefore gauge invariant. The familiar ‘covariant derivative’ rule can
be recovered by letting $y = x + dx$ for infinitesimal $dx$, and by considering the
gauge-invariant quantity

$$\lim_{dx \to 0} \left[ \frac{O(x, x + dx) - O(x, x)}{dx} \right]. \qquad (16.33)$$

Evaluating (16.33) one finds (problem 16.4) the result

$$\phi^\dagger(x) \left( \frac{d}{dx} - ieA \right) \phi(x) \qquad (16.34)$$
$$\equiv \phi^\dagger(x) D_x \phi(x) \qquad (16.35)$$


with the usual definition of the covariant derivative. In the discrete case, we
merely keep the finite version of (16.31), and replace $\phi^\dagger(n_1)\phi(n_1 + 1)$ in (16.28)
by the gauge invariant quantity

$$\phi^\dagger(n_1)\, U(n_1, n_1 + 1)\, \phi(n_1 + 1), \qquad (16.36)$$

where the link variable $U$ is defined by

$$U(n_1, n_1 + 1) = \exp\left[ ie \int_{(n_1 + 1)a}^{n_1 a} A\, dx' \right]. \qquad (16.37)$$

Note that

$$U(n_1, n_1 + 1) \to \exp[-ieA(n_1)a] \qquad (16.38)$$

in the small $a$ limit.
Similarly, the free Dirac term $\bar{\psi}(n_1)\gamma_1^E \psi(n_1 + 1) - \bar{\psi}(n_1 + 1)\gamma_1^E \psi(n_1)$ in
(16.18) is replaced by the gauge-invariant term

$$\bar{\psi}(n_1)\gamma_1^E U(n_1, n_1 + 1)\psi(n_1 + 1) - \bar{\psi}(n_1 + 1)\gamma_1^E U(n_1 + 1, n_1)\psi(n_1). \qquad (16.39)$$


The generalization to more dimensions is straightforward. In the
non-Abelian SU(2) or SU(3) case, ‘$eA$’ in (16.38) is replaced by $g t^a A^a(n_1)$ where
the $t$’s are the appropriate matrices, as in the continuum form of the covariant
derivative. A link variable $U(n_2, n_1)$ may be drawn as in figure 16.1. Note
that the order of the arguments is significant: $U(n_2, n_1) = U^{-1}(n_1, n_2) =
U^\dagger(n_1, n_2)$ from (16.38), which is why the link carries an arrow.
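The role of the link variable can be checked on a small periodic lattice. The sketch below takes the small-$a$ form (16.38) as the definition of the link, and uses the forward difference of (16.5) for the lattice version of $d\theta/dx$ in (16.29); both choices, together with the lattice size and the random field values, are assumptions made for the illustration.

```python
import cmath
import random

random.seed(0)
N, a, e = 8, 0.5, 0.3   # sites, spacing, charge: arbitrary illustrative values

phi = [complex(random.gauss(0, 1), random.gauss(0, 1)) for _ in range(N)]
A = [random.uniform(-1, 1) for _ in range(N)]        # A(n) lives on link (n, n+1)
theta = [random.uniform(-3, 3) for _ in range(N)]    # gauge function theta(n)


def link(A_n):
    """Small-a link variable (16.38): U(n, n+1) = exp(-i e A(n) a)."""
    return cmath.exp(-1j * e * A_n * a)


def hopping_sum(phi, A, with_link=True):
    """sum_n phi^dagger(n) U(n, n+1) phi(n+1), with periodic boundaries."""
    total = 0j
    for n in range(N):
        u = link(A[n]) if with_link else 1.0
        total += phi[n].conjugate() * u * phi[(n + 1) % N]
    return total


# Discrete gauge transformation (16.29), with a forward difference for d(theta)/dx.
phi_t = [cmath.exp(1j * e * theta[n]) * phi[n] for n in range(N)]
A_t = [A[n] + (theta[(n + 1) % N] - theta[n]) / a for n in range(N)]

change_with_link = abs(hopping_sum(phi_t, A_t) - hopping_sum(phi, A))
change_without = abs(hopping_sum(phi_t, A_t, with_link=False)
                     - hopping_sum(phi, A, with_link=False))
print(change_with_link, change_without)
```

With the link inserted, the hopping term is unchanged to rounding error; without it, the same transformation changes the sum by an amount of order one.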

FIGURE 16.1
Link variable U(n₂, n₁) in one dimension.

Thus gauge invariant discretized derivatives of charged fields can be con-
structed. What about the Maxwell action for the U(1) gauge field? This does
not exist in only one dimension (∂_μ A_ν − ∂_ν A_μ cannot be formed), so let us
move into two. Again, our discussion of the geometrical significance of F_μν as
a curvature guides us to the answer. Consider the product U_□ of link variables
around a square path (figure 16.2) of side a (reading from the right):

$$ \begin{aligned}
U_\square = {} & U(n_x, n_y; n_x, n_y+1)\, U(n_x, n_y+1; n_x+1, n_y+1) \\
& \times U(n_x+1, n_y+1; n_x+1, n_y)\, U(n_x+1, n_y; n_x, n_y).
\end{aligned} \tag{16.40} $$


It is straightforward to verify, first, that U_□ is gauge invariant. Under a
gauge transformation, the link U(n_x+1, n_y; n_x, n_y), for example, transforms
by a factor (cf equation (16.32))

$$ \exp\{ ie[\theta(n_x+1, n_y) - \theta(n_x, n_y)] \}, \tag{16.41} $$


and similarly for the three other links in U_□. In this Abelian case the expo-
nentials contain no matrices, and the accumulated phase factors cancel out,
verifying the gauge invariance. Next, let us see how to recover the Maxwell
action. Adding the exponentials again, we can write

$$ \begin{aligned}
U_\square &= \exp\{ -iea A_y(n_x,n_y) - iea A_x(n_x,n_y+1) + iea A_y(n_x+1,n_y) + iea A_x(n_x,n_y) \} && (16.42) \\
&= \exp\left\{ -iea^2 \left[ \frac{A_x(n_x,n_y+1) - A_x(n_x,n_y)}{a} \right] + iea^2 \left[ \frac{A_y(n_x+1,n_y) - A_y(n_x,n_y)}{a} \right] \right\} && (16.43) \\
&= \exp\left\{ +iea^2 \left( \frac{\partial A_y}{\partial x} - \frac{\partial A_x}{\partial y} \right) \right\}, && (16.44)
\end{aligned} $$

using the derivative definition of (16.5). For small ‘a’ we may expand the
exponential in (16.44). We also take the real part to remove the imaginary
terms, leading to

$$ \sum_\square (1 - \mathrm{Re}\, U_\square) \to \frac{1}{2} \sum_\square e^2 a^4 (F_{xy})^2, \tag{16.45} $$

where F_xy = ∂A_y/∂x − ∂A_x/∂y as usual. To relate this to the continuum limit
we must note that we sum over each such plaquette with only one definite
orientation, so that the sum over plaquettes is equivalent to half of the entire
sum. Thus

$$ \sum_\square (1 - \mathrm{Re}\, U_\square) \to \frac{1}{4} \sum_{n_1,n_2} e^2 a^4 F_{\mu\nu}F_{\mu\nu} \to e^2 a^2 \iint \frac{1}{4} F_{\mu\nu}F_{\mu\nu}\, dx\, dy. \tag{16.46} $$

FIGURE 16.2
A simple plaquette in two dimensions.


(Note that in two dimensions ‘e’ has dimensions of mass.) In four dimensions
similar manipulations lead to the form

$$ S_E = \frac{1}{e^2} \sum_\square (1 - \mathrm{Re}\, U_\square) \to \frac{1}{4} \int F_{\mu\nu}F_{\mu\nu}\, d^4x, \tag{16.47} $$

for the lattice action, as required. In the non-Abelian case, as noted above,
‘eA’ is replaced by ‘g t · A’; for SU(3), the analogue of (16.47) is

$$ S_g = \frac{2}{g^2} \sum_\square \mathrm{Tr}\,(1 - \mathrm{Re}\, U_\square), \tag{16.48} $$

where the trace is over the SU(3) matrices.
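
The small-a behaviour of a single U(1) plaquette can be checked numerically. The sketch below is illustrative only: the smooth test field (A_x, A_y), the point (x0, y0) and the values of e and a are assumptions chosen for the example, and each link phase is taken in its small-a form, so that the plaquette phase is the signed sum of four terms ±eaA.

```python
import numpy as np

e, a = 0.9, 1e-2          # illustrative coupling and lattice spacing (assumed values)

# A smooth test field (hypothetical) with a known curl F_xy = dA_y/dx - dA_x/dy
def A_x(x, y):
    return x * np.sin(y)

def A_y(x, y):
    return np.cos(x) + x * y

def F_xy(x, y):
    return (-np.sin(x) + y) - x * np.cos(y)

def plaquette(x, y):
    # Signed sum of the four link phases around the square of side a
    phase = e * a * (-A_y(x, y) - A_x(x, y + a) + A_y(x + a, y) + A_x(x, y))
    return np.exp(1j * phase)

x0, y0 = 2.0, 1.0
lhs = 1.0 - plaquette(x0, y0).real                  # 1 - Re U_plaquette
rhs = 0.5 * e**2 * a**4 * F_xy(x0, y0)**2           # (1/2) e^2 a^4 (F_xy)^2

print(lhs, rhs, lhs / rhs)   # the ratio tends to 1 as a is reduced
```

The residual difference between the two numbers is of relative order a, coming from the higher derivatives of A that are dropped in passing from finite differences to F_xy.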


16.3 Representation of quantum amplitudes 


So (with some suitable fermionic action) we have a gauge-invariant ‘classical’ 
field theory defined on a lattice, with a suitable continuum limit. (Actually, 
the a → 0 limit of the quantum theory is, as we shall see in section 16.5,
more subtle than the naive replacements (16.5) because of renormalization 
issues, as should be no surprise to the reader by now). However, we have 
not yet considered how we are going to turn this classical lattice theory into 
a quantum one. The fact that the calculations are mostly going to have to 
be done numerically seems at once to require a formulation that avoids non- 
commuting operators. This is precisely what is provided by Feynman’s sum 
over paths formulation of quantum mechanics and of quantum field theory, 
and it is therefore an essential element in the lattice approach to quantum 
field theory. In this section we give a brief introduction to this formalism, 
starting with quantum mechanics. 


16.3.1 Quantum mechanics 


In section 5.2.2 we stated that in this approach the amplitude for a quantum 
system, described by a Lagrangian L depending on one degree of freedom q(t), 
to pass from a state in which q = q^i at t = t_i to a state in which q = q^f at
time t = t_f, is proportional to (with ħ = 1)

$$ \sum_{\text{all paths } q(t)} \exp\left( i \int_{t_i}^{t_f} L(q(t), \dot q(t))\, dt \right), \tag{16.49} $$

where q(t_i) = q^i, and q(t_f) = q^f. We shall now provide some justification for
this assertion.

We begin by recalling how, in ordinary quantum mechanics, state vectors 
and observables are related in the Schrödinger and Heisenberg pictures (see 
appendix I of volume 1). Let q̂ be the canonical coordinate operator in the
Schrödinger picture, with an associated complete set of eigenvectors |q⟩ such
that

$$ \hat q\, |q\rangle = q\, |q\rangle. \tag{16.50} $$

The corresponding Heisenberg operator q̂_H(t) is defined by

$$ \hat q_H(t) = e^{i\hat H (t-t_0)}\, \hat q\, e^{-i\hat H (t-t_0)}, \tag{16.51} $$

where Ĥ is the Hamiltonian, and t_0 is the (arbitrary) time at which the two
pictures coincide. Now define the Heisenberg picture state |q_t⟩_H by

$$ |q_t\rangle_H = e^{i\hat H (t-t_0)}\, |q\rangle. \tag{16.52} $$

We then easily obtain from (16.50)-(16.52) the result

$$ \hat q_H(t)\, |q_t\rangle_H = q\, |q_t\rangle_H, \tag{16.53} $$

which shows that |q_t⟩_H is the (Heisenberg picture) state which at time t is an
eigenstate of q̂_H(t) with eigenvalue q. Consider now the quantity

$$ {}_H\langle q^f_{t_f} | q^i_{t_i} \rangle_H \tag{16.54} $$

which is, indeed, the amplitude for the system described by Ĥ to go from q^i
at t_i to q^f at t_f. Using (16.52) we can write

$$ {}_H\langle q^f_{t_f} | q^i_{t_i} \rangle_H = \langle q^f | e^{-i\hat H (t_f - t_i)} | q^i \rangle; \tag{16.55} $$

we want to understand how (16.55) can be represented as (16.49).
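We shall demonstrate the connection explicitly for the special case of a
free particle, for which

$$ \hat H = \frac{\hat p^2}{2m}. \tag{16.56} $$

For this case, we can evaluate (16.55) directly as follows. Inserting a complete
set of momentum eigenstates, we obtain¹

$$ \begin{aligned}
\langle q^f | e^{-i\hat H (t_f - t_i)} | q^i \rangle
&= \int_{-\infty}^{\infty} \langle q^f | e^{-i\hat H (t_f - t_i)} | p \rangle \langle p | q^i \rangle \, dp \\
&= \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{ipq^f}\, e^{-ip^2 (t_f - t_i)/2m}\, e^{-ipq^i} \, dp \\
&= \frac{1}{2\pi} \int_{-\infty}^{\infty} \exp\left\{ -i \left[ \frac{p^2 (t_f - t_i)}{2m} - p (q^f - q^i) \right] \right\} dp. && (16.57)
\end{aligned} $$

To evaluate the integral, we complete the square via the steps

$$ \begin{aligned}
\frac{p^2 (t_f - t_i)}{2m} - p (q^f - q^i)
&= \left( \frac{t_f - t_i}{2m} \right) \left[ p^2 - \frac{2mp (q^f - q^i)}{t_f - t_i} \right] \\
&= \left( \frac{t_f - t_i}{2m} \right) \left[ \left( p - \frac{m (q^f - q^i)}{t_f - t_i} \right)^2 - \frac{m^2 (q^f - q^i)^2}{(t_f - t_i)^2} \right] \\
&= \left( \frac{t_f - t_i}{2m} \right) p'^2 - \frac{m (q^f - q^i)^2}{2 (t_f - t_i)}, && (16.58)
\end{aligned} $$

where

$$ p' = p - \frac{m (q^f - q^i)}{t_f - t_i}. \tag{16.59} $$

We then shift the integration variable in (16.57) to p', and obtain

$$ \langle q^f | e^{-i\hat H (t_f - t_i)} | q^i \rangle = \frac{1}{2\pi} \exp\left[ \frac{i m (q^f - q^i)^2}{2 (t_f - t_i)} \right] \int_{-\infty}^{\infty} dp' \, \exp\left[ -\frac{i (t_f - t_i) p'^2}{2m} \right]. \tag{16.60} $$

As it stands, the integral in (16.60) is not well-defined, being rapidly oscillatory
for large p'. However, it is at this point that the motivation for passing to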


¹ Remember that ⟨q|p⟩ is the q-space wavefunction of a state with definite momentum p,
and is therefore a plane wave; we are using the normalization of equation (E.26) in volume 1.
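‘Euclidean’ space-time arises. If we make the replacement t → −iτ, (16.60)
becomes

$$ \langle q^f | e^{-\hat H (\tau_f - \tau_i)} | q^i \rangle = \frac{1}{2\pi} \exp\left[ -\frac{m (q^f - q^i)^2}{2 (\tau_f - \tau_i)} \right] \int_{-\infty}^{\infty} dp' \, \exp\left[ -\frac{(\tau_f - \tau_i) p'^2}{2m} \right], \tag{16.61} $$

and the integral is a simple convergent Gaussian. Using the result

$$ \int_{-\infty}^{\infty} d\xi \, e^{-b\xi^2} = \sqrt{\frac{\pi}{b}}, \tag{16.62} $$

we finally obtain

$$ \langle q^f | e^{-\hat H (\tau_f - \tau_i)} | q^i \rangle = \left[ \frac{m}{2\pi (\tau_f - \tau_i)} \right]^{1/2} \exp\left[ -\frac{m (q^f - q^i)^2}{2 (\tau_f - \tau_i)} \right]. \tag{16.63} $$
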

We must now understand how the result (16.63) can be represented in the 
form (16.49). In Euclidean space, (16.49) is

$$ \sum_{\text{paths}} \exp\left( -\frac{m}{2} \int_{\tau_i}^{\tau_f} \left( \frac{dq}{d\tau} \right)^2 d\tau \right) \tag{16.64} $$

in the free-particle case. We interpret the τ integral in terms of a discretization
procedure, similar to that introduced in section 16.2. We split the interval
τ_f − τ_i into N segments each of size ε, as shown in figure 16.3. The τ-integral
in (16.64) becomes the sum

$$ m \sum_{j=1}^{N} \frac{(q^j - q^{j-1})^2}{2\epsilon}, \tag{16.65} $$

and the ‘sum over paths’, in going from q^0 ≡ q^i at τ_i to q^N ≡ q^f at τ_f, is
now interpreted as a multiple integral over all the intermediate positions
q^1, q^2, ..., q^{N−1} which paths can pass through at ‘times’ τ_1, τ_2, ..., τ_{N−1}:

$$ \frac{1}{A(\epsilon)} \int \!\! \int \!\cdots\! \int \exp\left[ -\sum_{j=1}^{N} \frac{m (q^j - q^{j-1})^2}{2\epsilon} \right] \frac{dq^1}{A(\epsilon)} \frac{dq^2}{A(\epsilon)} \cdots \frac{dq^{N-1}}{A(\epsilon)}, \tag{16.66} $$


where A(ε) is a normalizing factor, depending on ε, which is to be determined.

The integrals in (16.66) are all of Gaussian form, and since the integral
of a Gaussian is again a Gaussian (cf the manipulations leading from (16.57)
to (16.60), but without the ‘i’ in the exponents), we may perform all the
integrations analytically. We follow the method of Feynman and Hibbs (1965),
section 3.1. Consider the integral over q^1:

$$ I^1 \equiv \int \exp\left\{ -\frac{m}{2\epsilon} \left[ (q^2 - q^1)^2 + (q^1 - q^i)^2 \right] \right\} dq^1. \tag{16.67} $$

FIGURE 16.3
A ‘path’ from q^0 ≡ q^i at τ_i to q^N ≡ q^f at τ_f, via the intermediate positions
q^1, q^2, ..., q^{N−1} at τ_1, τ_2, ..., τ_{N−1}.

This can be evaluated by completing the square, shifting the integration vari-
able, and using (16.62), to obtain (problem 16.5)

$$ I^1 = \left( \frac{\pi\epsilon}{m} \right)^{1/2} \exp\left[ -\frac{m}{4\epsilon} (q^2 - q^i)^2 \right]. \tag{16.68} $$

Now the procedure may be repeated for the q^2 integral

$$ I^2 \equiv \int \exp\left\{ -\frac{m}{4\epsilon} (q^2 - q^i)^2 - \frac{m}{2\epsilon} (q^3 - q^2)^2 \right\} dq^2, \tag{16.69} $$

which yields (problem 16.5)

$$ I^2 = \left( \frac{4\pi\epsilon}{3m} \right)^{1/2} \exp\left[ -\frac{m}{6\epsilon} (q^3 - q^i)^2 \right]. \tag{16.70} $$

As far as the exponential factors in (16.68) and (16.70) are concerned, the
pattern is now clear: after n − 1 steps we shall have an exponential factor

$$ \exp\left[ -m (q^n - q^i)^2 / (2n\epsilon) \right]. \tag{16.71} $$
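Hence, after N − 1 steps we shall have a factor

$$ \exp\left[ -m (q^f - q^i)^2 / 2 (\tau_f - \tau_i) \right], \tag{16.72} $$

remembering that q^N ≡ q^f and that τ_f − τ_i = Nε. So we have recovered the
correct exponential factor of (16.63), and all that remains is to choose A(ε) in
(16.66) so as to produce the same normalization as (16.63).

The required A(ε) is

$$ A(\epsilon) = \sqrt{\frac{2\pi\epsilon}{m}}, \tag{16.73} $$

as we now verify. For the first (q^1) integration, the formula (16.66) contains
two factors of A⁻¹(ε), so that the result (16.68) becomes

$$ \frac{m}{2\pi\epsilon} \left( \frac{\pi\epsilon}{m} \right)^{1/2} \exp\left[ -\frac{m}{4\epsilon} (q^2 - q^i)^2 \right] = \left( \frac{m}{4\pi\epsilon} \right)^{1/2} \exp\left[ -\frac{m}{4\epsilon} (q^2 - q^i)^2 \right]. \tag{16.74} $$

For the second (q^2) integration, the accumulated constant factor is

$$ \left( \frac{m}{2\pi\epsilon} \right)^{1/2} \left( \frac{m}{4\pi\epsilon} \right)^{1/2} \left( \frac{4\pi\epsilon}{3m} \right)^{1/2} = \left( \frac{m}{2\pi \, 3\epsilon} \right)^{1/2}. \tag{16.75} $$

Proceeding in this way, one can convince oneself that after N − 1 steps, the
accumulated constant is

$$ \left( \frac{m}{2\pi N\epsilon} \right)^{1/2} = \left[ \frac{m}{2\pi (\tau_f - \tau_i)} \right]^{1/2}, \tag{16.76} $$

as in (16.63).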
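
The composition of short-time Gaussian factors can also be checked numerically. The sketch below is an illustration under stated assumptions rather than the method of the text: the intermediate positions are put on a finite grid, each integration is replaced by a sum times the grid spacing, A(ε) is taken to be √(2πε/m), and the values of m, ε, N and the grid are arbitrary.

```python
import numpy as np

m, eps, N = 1.0, 0.1, 8             # illustrative values (assumptions)
q = np.linspace(-8.0, 8.0, 801)     # grid for the positions q^j
dq = q[1] - q[0]

A = np.sqrt(2.0 * np.pi * eps / m)  # normalizing factor A(eps)

# One-step factor (1/A) exp[-m (q_j - q_k)^2 / (2 eps)] as a matrix on the grid
K = np.exp(-m * (q[:, None] - q[None, :])**2 / (2.0 * eps)) / A

# N steps: N one-step factors and N-1 intermediate integrations (sum times dq)
amp = K.copy()
for _ in range(N - 1):
    amp = (amp @ K) * dq

def closed_form(qf, qi, T):
    # Free-particle Euclidean amplitude with T = tau_f - tau_i
    return np.sqrt(m / (2.0 * np.pi * T)) * np.exp(-m * (qf - qi)**2 / (2.0 * T))

i = np.argmin(np.abs(q - 0.0))      # index of q^i = 0
f = np.argmin(np.abs(q - 0.5))      # index of q^f = 0.5
numeric = amp[f, i]
exact = closed_form(q[f], q[i], N * eps)

print(numeric, exact)
```

With the grid chosen wide and fine compared with the width √(ε/m) of a single step, the two numbers agree to high accuracy, reflecting the fact that the Gaussian form reproduces itself under each integration.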

The equivalence of (16.63) and (16.64) (in the sense ε → 0) is therefore
established for the free-particle case. More general cases are discussed in
Feynman and Hibbs (1965) chapter 5, and in Peskin and Schroeder (1995)
chapter 9. The conventional notation for the path-integral amplitude is

$$ \langle q^f | e^{-\hat H (\tau_f - \tau_i)} | q^i \rangle = \int \mathcal{D}q(\tau) \, e^{-\int_{\tau_i}^{\tau_f} L \, d\tau}, \tag{16.77} $$

where the right-hand side of (16.77) is interpreted in the sense of (16.66).

We now proceed to discuss further aspects of the path-integral formula-
tion. Consider the (Euclideanized) amplitude ⟨q^f|exp[−Ĥ(τ_f − τ_i)]|q^i⟩ and insert a
complete set of energy eigenstates |n⟩ such that Ĥ|n⟩ = E_n|n⟩:

$$ \langle q^f | e^{-\hat H (\tau_f - \tau_i)} | q^i \rangle = \sum_n \langle q^f | n \rangle \langle n | q^i \rangle \, e^{-E_n (\tau_f - \tau_i)}. \tag{16.78} $$

Equation (16.78) shows that if we take the limits τ_i → −∞, τ_f → ∞, then the
state of lowest energy E_0 (the ground state) provides the dominant contribu-
tion. Thus, in this limit, our amplitude will represent the process in which
the system begins in its ground state |Ω⟩ at τ_i → −∞, with q = q^i, and ends
in |Ω⟩ at τ_f → ∞, with q = q^f.

How do we represent propagators in this formalism? Consider the expres-
sion (somewhat analogous to a field theory propagator)

$$ G_{fi}(t_a, t_b) \equiv {}_H\langle q^f_{t_f} | T\{ \hat q_H(t_a) \, \hat q_H(t_b) \} | q^i_{t_i} \rangle_H, \tag{16.79} $$

where T is the usual time-ordering operator. Using (16.51) and (16.52),
(16.79) can be written, for t_b > t_a, as

$$ G_{fi}(t_a, t_b) = \langle q^f | e^{-i\hat H (t_f - t_b)} \, \hat q \, e^{-i\hat H (t_b - t_a)} \, \hat q \, e^{-i\hat H (t_a - t_i)} | q^i \rangle. \tag{16.80} $$

Inserting a complete set of states and Euclideanizing, (16.80) becomes

$$ \begin{aligned}
G_{fi}(\tau_a, \tau_b) = \int dq^a \, dq^b \, q^a q^b \, & \langle q^f | e^{-\hat H (\tau_f - \tau_b)} | q^b \rangle \\
\times \, & \langle q^b | e^{-\hat H (\tau_b - \tau_a)} | q^a \rangle \langle q^a | e^{-\hat H (\tau_a - \tau_i)} | q^i \rangle.
\end{aligned} \tag{16.81} $$

Now, each of the three matrix elements has a discretized representation of the
form (16.66), with say N_1 − 1 variables in the interval (τ_a, τ_i), N_2 − 1 in (τ_b, τ_a)
and N_3 − 1 in (τ_f, τ_b). Each such representation carries one ‘surplus’ factor
of [A(ε)]⁻¹, making an overall factor of [A(ε)]⁻³. Two of these factors can be
associated with the dq^a dq^b integration in (16.81), so that we have a total of
N_1 + N_2 + N_3 − 1 properly normalized integrations, and one ‘surplus’ factor
[A(ε)]⁻¹ as in (16.66). If we now identify q(τ_a) ≡ q^a, q(τ_b) ≡ q^b, it follows
that (16.81) is simply

$$ \int \mathcal{D}q(\tau) \, q(\tau_a) \, q(\tau_b) \, e^{-\int_{\tau_i}^{\tau_f} L \, d\tau}. \tag{16.82} $$

In obtaining (16.82), we took the case τ_b > τ_a. Suppose alternatively that
τ_a > τ_b. Then the order of τ_a and τ_b inside the interval (τ_i, τ_f) is simply
reversed, but since q^a and q^b in (16.81), or q(τ_a) and q(τ_b) in (16.82), are
ordinary (commuting) numbers, the formula (16.82) is unaltered, and actually
does represent the matrix element (16.79) of the time-ordered product.


16.3.2 Quantum field theory 


The generalizations of these results to the field theory case are intuitively
clear. For example, in the case of a single scalar field φ(x), we expect the
analogue of (16.82) to be (cf (16.4))

$$ \int \mathcal{D}\phi(x) \, \phi(x_a) \phi(x_b) \, \exp\left[ -\int_{\tau_i}^{\tau_f} \mathcal{L}_E(\phi, \nabla\phi, \partial_\tau \phi) \, d^4x_E \right], \tag{16.83} $$

where

$$ d^4x_E = d^3\mathbf{x} \, d\tau, \tag{16.84} $$

and the boundary conditions are given by $\phi(\mathbf{x}, \tau_i) = \phi^i(\mathbf{x})$, $\phi(\mathbf{x}, \tau_f) = \phi^f(\mathbf{x})$,
$\phi(\mathbf{x}, \tau_a) = \phi^a(\mathbf{x})$ and $\phi(\mathbf{x}, \tau_b) = \phi^b(\mathbf{x})$, say. In (16.83), we have to understand
that a four-dimensional discretization of Euclidean space-time is implied, the
fields being Fourier-analyzed by four-dimensional generalizations of expres-
sions such as (16.7). Just as in (16.79)-(16.82), (16.83) is equal to

$$ \langle \phi^f(\mathbf{x}) | e^{-\hat H \tau_f} \, T\{ \hat\phi_H(x_a) \hat\phi_H(x_b) \} \, e^{\hat H \tau_i} | \phi^i(\mathbf{x}) \rangle. \tag{16.85} $$

Taking the limits τ_i → −∞, τ_f → ∞ will project out the configuration of
lowest energy, as discussed after (16.78), which in this case is the (interacting)
vacuum state |Ω⟩. Thus in this limit the surviving part of (16.85) is

$$ \langle \phi^f(\mathbf{x}) | \Omega \rangle \, e^{-E_\Omega \tau} \, \langle \Omega | T\{ \hat\phi_H(x_a) \hat\phi_H(x_b) \} | \Omega \rangle \, \langle \Omega | \phi^i(\mathbf{x}) \rangle \tag{16.86} $$

with τ → ∞. The exponential and overlap factors can be removed by dividing
by the same quantity as (16.85) but without the additional fields φ(x_a) and
φ(x_b). In this way, we obtain the formula for the field theory propagator in
four-dimensional Euclidean space:

$$ \langle \Omega | T\{ \hat\phi_H(x_a) \hat\phi_H(x_b) \} | \Omega \rangle = \lim_{\tau \to \infty} \frac{\int \mathcal{D}\phi \, \phi(x_a) \phi(x_b) \exp\left[ -\int_{-\tau}^{\tau} \mathcal{L}_E \, d^4x_E \right]}{\int \mathcal{D}\phi \, \exp\left[ -\int_{-\tau}^{\tau} \mathcal{L}_E \, d^4x_E \right]}. \tag{16.87} $$

Vacuum expectation values of time-ordered products of more fields will simply
have more factors of φ on both sides.

Perturbation theory can be developed in this formalism also. Suppose
$\mathcal{L}_E = \mathcal{L}_E^0 + \mathcal{L}_E^{\rm int}$, where $\mathcal{L}_E^0$ describes a free scalar field and $\mathcal{L}_E^{\rm int}$ is an interaction,
for example λφ⁴. Then, assuming λ is small, the exponential in (16.87) can
be expressed as

$$ \exp\left[ -\int d^4x_E \, (\mathcal{L}_E^0 + \lambda\phi^4) \right] = \exp\left( -\int d^4x_E \, \mathcal{L}_E^0 \right) \left( 1 - \lambda \int d^4x_E \, \phi^4 + \ldots \right) \tag{16.88} $$

and both numerator and denominator of (16.87) may be expressed as vevs of
products of free fields. Compact techniques exist for analyzing this formula-
tion of perturbation theory (Ryder 1985, chapter 6, Peskin & Schroeder 1995,
chapter 9), and one finds exactly the same ‘Feynman rules’ as in the canonical
(operator) approach.

In the case of gauge theories, we can easily imagine a formula similar to
(16.87) for the gauge field propagator, in which the integral is carried out over
all gauge fields A_μ(x) (in the U(1) case, for example). But we already know
from chapter 7 (or from chapter 13 in the non-Abelian case) that we shall
not be able to construct a well-defined perturbation theory in this way, since
the gauge field propagator will not exist unless we ‘fix the gauge’ by imposing
some constraint, such as the Lorentz gauge condition. Such constraints can
be imposed on the corresponding path integral, and indeed this was the route
followed by Faddeev and Popov (1967) in first obtaining the Feynman rules
for non-Abelian gauge theories, as mentioned in section 13.5.3.

In the discrete case, the appropriate integration variables are the link vari-
ables U(l_i) where l_i is the ith link. They are elements of the relevant gauge
group; for example, U(n₁, n₁ + 1) of (16.37) is an element of U(1). In the
case of the unitary groups, such elements typically have the form (cf (12.35))
∼ exp(i Hermitean matrix), where the ‘Hermitean matrix’ can be parametrized
in some convenient way, for example as in (12.31) for SU(2). In all these
cases, the variables in the parametrization of U vary over some bounded do- 
main (they are essentially ‘angle-type’ variables, as in the simple U(1) case), 
and so, with a finite number of lattice points, the integral over the link vari- 
ables is well-defined without gauge-fixing. The integration measure for the 
link variables can be chosen so as to be gauge invariant, and hence provided 
the action is gauge invariant, the formalism provides well-defined expressions, 
independently of perturbation theory, for vevs of gauge invariant quantities. 
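
As a hedged illustration of this point, consider a single U(1) plaquette whose four links are written as exp(iθ_l), each angle integrated over [0, 2π) with the normalized measure dθ/2π. The sketch assumes a weight exp[−β(1 − Re U_□)] with β = 1/e², and an arbitrary value of e; the closed form exp(−β) I₀(β), with I₀ the modified Bessel function, is a standard result for this one-plaquette integral.

```python
import numpy as np

e = 0.8                       # illustrative coupling (assumed value)
beta = 1.0 / e**2
n = 16                        # grid points per link angle
theta = 2.0 * np.pi * np.arange(n) / n

# Four link angles, each ranging over a bounded ('angle-type') domain
t1, t2, t3, t4 = np.meshgrid(theta, theta, theta, theta, indexing="ij")
plaq_angle = t1 + t2 - t3 - t4              # phase of the ordered product of the four links
weight = np.exp(-beta * (1.0 - np.cos(plaq_angle)))

# Averages over the grid approximate integrals with measure d(theta)/(2 pi) per link
Z = weight.mean()
mean_cos = (np.cos(plaq_angle) * weight).mean() / Z   # vev of the gauge-invariant Re U

Z_exact = np.exp(-beta) * np.i0(beta)       # closed form for the one-plaquette integral

print(Z, Z_exact, mean_cos)
```

No gauge condition has been imposed: the redundant gauge directions simply contribute a finite factor, because every angle ranges over a finite interval.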

There remains one more conceptual problem to be addressed in this ap-
proach: namely, how are we to deal with fermions? It seems that we must
introduce new variables which, though not quantum field operators, must
nevertheless anticommute with each other. Such ‘classical’ anticommuting
variables are called Grassmann variables, and are briefly described in ap-
pendix P. Further details are contained in Ryder (1985) and in Peskin and
Schroeder (1995, section 9.5). For our purposes, the important point is that
the fermion Lagrangian is bilinear in the (Grassmann) fermion fields ψ, the
fermionic action for one flavour having the form

$$ S_{\psi_f} = \int d^4x_E \, \bar\psi_f M_f(U) \psi_f, \tag{16.89} $$

where M_f is a matrix representing the Dirac operator iD̸ − m_f in its discretized
and Euclideanized form. This means that in a typical fermionic amplitude of
the form (cf the denominator of (16.87))

$$ Z_{\psi_f} = \int \mathcal{D}\bar\psi_f \, \mathcal{D}\psi_f \, \exp[-S_{\psi_f}], \tag{16.90} $$

one has essentially an integral of Gaussian type (albeit with Grassmann vari-
ables), which can actually be performed analytically². The result is simply
det[M_f(U)], the determinant of the Dirac operator matrix. For N_f flavours,
this easily generalizes to

$$ \prod_{f=1}^{N_f} \det[M_f(U)]. \tag{16.91} $$

² See appendix P.
Now we may write

$$ \prod_{f=1}^{N_f} \det M_f(U) = \exp\left[ \sum_f \ln \det M_f(U) \right], \tag{16.92} $$

so that the effect of N_f fermions is to contribute an additional term

$$ S_{\rm eff}(U) = -\sum_f \ln \det[M_f(U)] \tag{16.93} $$

to the gluonic action. But although formally correct, this fermionic contribu-
tion is computationally very time-consuming to include. Until the mid-1990s
it could not be done, and instead calculations were made using the quenched
approximation, in which the determinant is set equal to a constant indepen-
dent of the link variables U. This is equivalent to the neglect of closed fermion
loops in a Feynman graph approach, i.e. no vacuum polarization insertions
on virtual gluon lines. Vacuum polarization amplitudes typically behave as
q²/m² for q² ≪ m², where q is the momentum flowing into the loop (see equa-
tion (11.39), for example, in the case of QED). The quenched approximation
is therefore poorer for the light quarks u, d and s.
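
The rewriting of a product of determinants as an extra term in the action is easy to illustrate numerically. In the sketch below the matrices are hypothetical stand-ins for the Dirac matrices of the text (random positive-definite matrices, so that the logarithm is real); only the algebraic identity is being demonstrated, not any property of a real lattice Dirac operator.

```python
import numpy as np

rng = np.random.default_rng(1)

def toy_matrix(n, mass):
    # Hypothetical stand-in for M_f(U): positive definite, so ln det is real
    B = rng.normal(size=(n, n))
    return B @ B.T + mass * np.eye(n)

masses = [0.1, 0.2, 1.5]                    # illustrative 'flavour' masses (assumed)
Ms = [toy_matrix(6, mf) for mf in masses]

# slogdet returns (sign, ln|det|); working with logarithms avoids overflow for large matrices
log_dets = [np.linalg.slogdet(M)[1] for M in Ms]
S_eff = -sum(log_dets)                      # S_eff = - sum_f ln det M_f

prod_det = np.prod([np.linalg.det(M) for M in Ms])

print(prod_det, np.exp(-S_eff))             # the two agree: prod det = exp(-S_eff)
```

In this language the quenched approximation amounts to dropping S_eff altogether, since a constant determinant only changes the overall normalization.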

By the later 1990s it was possible to include the determinant provided the 
quark masses were not too small: the computation slowed down seriously for 
light quark masses. So calculations were done for unphysically large values of 
m_u, m_d and m_s, and the results extrapolated towards the physical values.

Beginning in the early 2000s, however, more precise calculations with sub- 
stantially lighter quark masses became possible, using the staggered fermion 
formulation discussed in section 16.2.2. It will be recalled that this saves a 
factor of four in the number of degrees of freedom. But there is still the re- 
maining problem of the four unwanted additional ‘tastes’. If these tastes are 
degenerate, as they would be in the continuum limit, then we can use the 
simple trick of replacing S_eff(U) by ¼ S_eff(U), which means that we take the
fourth root of the staggered fermion determinant. The true physical (non-
degenerate) quark flavour multiplicity still remains, of course, and we arrive
at

$$ S_{\rm eff \; stag.} = -\ln \left[ \det\{ M_{{\rm stag.} \, u}(U) \, M_{{\rm stag.} \, d}(U) \, M_{{\rm stag.} \, s}(U) \} \right]^{1/4}. \tag{16.94} $$

Unfortunately, things are not so simple away from the continuum limit, at
finite lattice spacing a. Bernard, Golterman and Shamir (2006) pointed out
that the quantity

$$ \{ \det M_{\rm stag.}(U) \}^{1/4} \tag{16.95} $$

cannot be represented by a local single-taste theory except in the continuum
limit: at finite a, it represents a non-local single-taste action. Locality is a
very fundamental property of all successful quantum field theories, and its
recovery from (16.95) in the limit a → 0 is not obvious. We refer to Sharpe
(2006) for a full discussion, and further references. Meanwhile, as we shall see
in section 16.6, some of the currently (in 2011) most accurate published results
in lattice QCD are using staggered fermions with the ‘rooting’ procedure.
16.3.3 Connection with statistical mechanics


Not the least advantage of the path integral formulation of quantum field 
theory (especially in its lattice form) is that it enables a highly suggestive 
connection to be set up between quantum field theory and statistical me- 
chanics. We introduce this connection now, by way of a preliminary to the 
discussion of renormalization in the following section. 

The connection is made via the fundamental quantity of equilibrium sta-
tistical mechanics, the partition function Z defined by

$$ Z = \sum_{\text{configurations}} \exp\left( -\frac{H}{k_B T} \right), \tag{16.96} $$

which is simply the ‘sum over states’ (or configurations) of the relevant de-
grees of freedom, with the Boltzmann weighting factor. H is the classical
Hamiltonian evaluated for each configuration. Consider, for comparison, the
denominator in (16.87), namely

$$ Z_\phi = \int \mathcal{D}\phi \, \exp(-S_E), \tag{16.97} $$

where

$$ S_E = \int d^4x_E \, \mathcal{L}_E = \int d^4x_E \left\{ \frac{1}{2} (\partial_\tau \phi)^2 + \frac{1}{2} (\nabla\phi)^2 + \frac{1}{2} m^2 \phi^2 + \lambda\phi^4 \right\} \tag{16.98} $$

in the case of a single scalar field with mass m and self-interaction λφ⁴. The
Euclideanized Lagrangian density L_E is like an energy density: it is bounded
from below, and increases when the field has large magnitude or has large
gradients in τ or x. The factor exp(−S_E) is then a sensible statistical weight
for the fluctuations in φ, and Z_φ may be interpreted as the partition function
for a system described by the field degree of freedom φ, but of course in four
‘spatial’ dimensions.

The parallel becomes perhaps even stronger when we discretize space-time.
In an Ising model (see the following section), the Hamiltonian has the form

$$ H = -J \sum_n s_n s_{n+1}, \tag{16.99} $$

where J is a constant, and the sum is over lattice sites n, the system variables
taking the values ±1. When (16.99) is inserted into (16.96), we arrive at
something very reminiscent of the φ(n₁)φ(n₁ + 1) term in (16.6). Naturally,
the effective ‘Hamiltonian’ is not quite the same, though we may note that
Wilson (1971b) argued that in the case of a φ⁴ interaction the parameters can
be chosen so as to make the values φ = ±1 the most heavily weighted in S_E.
Statistical mechanics does, of course, deal in three spatial dimensions, not
the four of our Euclideanized space-time. Nevertheless, it is remarkable that
quantum field theory in three spatial dimensions appears to have such a close
relationship to equilibrium statistical mechanics in four spatial dimensions.

One insight we may draw from this connection is that, in the case of pure 
gauge actions (16.47) or (16.48), the gauge coupling is seen to be analogous 
to an inverse temperature, by comparison with (16.96). One is led to wonder 
whether something like transitions between different ‘phases’ exist, as coupling 
constants (or other parameters) vary — and, indeed, such changes of ‘phase’ 
can occur. 

A second point is somewhat related to this. In statistical mechanics, an
important quantity is the correlation length ξ, which for a spin system may
be defined via the spin-spin correlation function

$$ G(\mathbf{x}) = \langle s(\mathbf{x}) s(0) \rangle = \sum_{\text{all } s(\mathbf{x})} s(\mathbf{x}) s(0) \, e^{-H/k_B T}, \tag{16.100} $$

where we are once more reverting to a continuous x variable. For large |x|,
this takes the form

$$ G(\mathbf{x}) \propto \frac{1}{|\mathbf{x}|} \exp\left( \frac{-|\mathbf{x}|}{\xi(T)} \right). \tag{16.101} $$

The Fourier transform of this (in the continuum limit) is

$$ \tilde G(\mathbf{k}^2) \propto \left( \mathbf{k}^2 + \xi^{-2}(T) \right)^{-1}, \tag{16.102} $$

as we learned in section 1.3.3. Comparing (16.100) with (16.87), it is clear
that (16.100) is proportional to the propagator (or Green function) for the field
s(x); (16.102) then shows that ξ⁻¹(T) is playing the role of a mass term m.
Now, near a critical point for a statistical system, correlations exist over very
large scales ξ compared to the inter-atomic spacing a; in fact, at the critical
point ξ(T_c) ∼ L, where L is the size of the system. In the quantum field
theory, as indicated earlier, we may regard a⁻¹ as playing a role analogous to
a momentum cut-off Λ, so the regime ξ ≫ a is equivalent to m ≪ Λ, as was
indeed always our assumption. Thus studying a quantum field theory this way
is analogous to studying a four-dimensional statistical system near a critical
point. This shows rather clearly why it is not going to be easy: correlations
over all scales will have to be included. At this point, we are naturally led to
the consideration of renormalization in the lattice formulation.




16.4 Renormalization, and the renormalization group, 
on the lattice 


16.4.1 Introduction 


In the continuum formulation which we have used elsewhere in this book,
fluctuations over short distances of order Λ⁻¹ generally lead to divergences
in the limit Λ → ∞, which are controlled (in a renormalizable theory) by
the procedure of renormalization. Such divergent fluctuations turn out, in
fact, to affect a renormalizable theory only through the values of some of
its parameters and, if these parameters are taken from experiment, all other
quantities become finite, even as Λ → ∞. This latter assertion is not easy
to prove, and indeed is quite surprising. However, this is by no means all
there is to renormalization theory: we have seen the power of ‘renormal-
ization group’ ideas in making testable predictions for QCD. Nevertheless,
the methods of chapter 15 were rather formal, and the reader may well
feel the need of a more physical picture of what is going on. Such a pic-
ture was provided by Wilson (1971a) (see also Wilson and Kogut 1974), us-
ing the ‘lattice + path integral’ approach. Another important advantage
of this formalism is, therefore, precisely the way in which, thanks to Wil-
son’s work, it provides access to a more intuitive way of understanding renor-
malization theory. The aim of this section is to give a brief introduction
to Wilson’s ideas, so as to illuminate the formal treatment of the previous
chapter.


In the ‘lattice + path integral’ approach to quantum field theory, the 
degrees of freedom involved are the values of the field(s) at each lattice site, 
as we have seen. Quantum amplitudes are formed by integrating suitable 
quantities over all values of these degrees of freedom, as in (16.87) for example. 
From this point of view, it should be possible to examine specifically how the 
‘short distance’ or ‘high momentum’ degrees of freedom affect the result. In 
fact, the idea suggests itself that we might be able to perform explicitly the 
integration (or summation) over those degrees of freedom located near the 
cutoff Λ in momentum space, or separated by only a lattice site or two in
co-ordinate space. If we can do this, the result may be compared with the
theory as originally formulated, to see how this ‘integration over short-distance
degrees of freedom’ affects the physical predictions of the theory. Having done
this once, we can imagine doing it again — and indeed iterating the process, 
until eventually we arrive at some kind of ‘effective theory’ describing physics 
in terms of ‘long-distance’ degrees of freedom. 


There are several aspects of such a programme which invite comment. 
First, the process of ‘integrating out’ short-distance degrees of freedom will 
obviously reduce the number of effective degrees of freedom, which is neces-
sarily very large in the case ξ ≫ a, as envisaged above. Thus it must be
a step in the right direction. Secondly, the above sketch of the ‘integrating 
out’ procedure suggests that, at any given stage of the integration, we shall 
be considering the system as described by parameters (including masses and 
couplings) appropriate to that scale, which is of course strongly reminiscent 
of RGE ideas. And thirdly, we may perhaps anticipate that the result of 
this ‘integrating out’ will be not only to render the parameters of the theory 
scale-dependent, but also, in general, to introduce new kinds of effective in- 
teractions into the theory. We now consider some simple examples which we 
hope will illustrate these points. 
FIGURE 16.4
A portion of the one-dimensional lattice of spins in the Ising model.


16.4.2 Two one-dimensional examples


Consider first a simple one-dimensional Ising model with Hamiltonian (16.99)
and partition function

$$ Z = \sum_{\{s_n\}} \exp\left[ K \sum_{n=0}^{N-1} s_n s_{n+1} \right], \tag{16.103} $$

where K = J/(k_B T) > 0. In (16.103) all the s_n variables take the values
±1 and the ‘sum over {s_n}’ means that all possible configurations of the N
variables s_0, s_1, s_2, ..., s_{N−1} are to be included. The spin s_n is located at the
lattice site na, and we shall (implicitly) be assuming the periodic boundary
condition s_n = s_{N+n}. Figure 16.4 shows a portion of the one-dimensional
lattice with the spins on the sites, each site being separated by the lattice
constant a. Thus, for the portion {s_{N−1}, s_0, ..., s_4} we are evaluating

$$ \sum_{s_{N-1}, s_0, s_1, s_2, s_3, s_4} \exp\left[ K (s_{N-1} s_0 + s_0 s_1 + s_1 s_2 + s_2 s_3 + s_3 s_4) \right]. \tag{16.104} $$

Now suppose we want to describe the system in terms of a ‘coarser’ lattice,
with lattice spacing 2a, and corresponding new spin variables s′_n. There are
many ways we could choose to describe the s′_n, but here we shall only consider
a very simple one (Kadanoff 1977) in which each s′_n is simply identified with
the s_n at the corresponding site (see figure 16.5). For the portion of the lattice
under consideration, then, (16.104) becomes

$$ \sum_{s_{N-1}, s'_0, s_1, s'_1, s_3, s'_2} \exp\left[ K (s_{N-1} s'_0 + s'_0 s_1 + s_1 s'_1 + s'_1 s_3 + s_3 s'_2) \right]. \tag{16.105} $$

If we can now perform the sums over s_1 and s_3 in (16.105), we shall end up
(for this portion) with an expression involving the ‘effective’ spin variables
s′_0, s′_1 and s′_2, situated twice as far apart as the original ones, and therefore
providing a more ‘coarse grained’ description of the system. Summing over s_1
and s_3 corresponds to ‘integrating out’ two short-distance degrees of freedom
as discussed earlier.

In fact, these sums are easy to do. Consider the quantity exp(K s′_0 s_1),
expanded as a power series:

$$ \exp(K s'_0 s_1) = 1 + K s'_0 s_1 + \frac{K^2}{2!} + \frac{K^3}{3!} (s'_0 s_1) + \ldots \tag{16.106} $$

where we have used (s′_0 s_1)² = 1. It follows that

$$ \exp(K s'_0 s_1) = \cosh K \, (1 + s'_0 s_1 \tanh K), \tag{16.107} $$

and similarly

$$ \exp(K s_1 s'_1) = \cosh K \, (1 + s_1 s'_1 \tanh K). \tag{16.108} $$

Thus the sum over s_1 is

$$ \sum_{s_1 = \pm 1} \cosh^2 K \, (1 + s'_0 s_1 \tanh K + s_1 s'_1 \tanh K + s'_0 s'_1 \tanh^2 K). \tag{16.109} $$

Clearly, the terms linear in s_1 vanish after summing, and the s_1 sum becomes
just

$$ 2 \cosh^2 K \, (1 + s'_0 s'_1 \tanh^2 K). \tag{16.110} $$

Remarkably, (16.110) contains a new ‘nearest-neighbour’ interaction, s′_0 s′_1,
just like the original one in (16.103), but with an altered coupling (and a dif-
ferent spin-independent piece). In fact, we can write (16.110) in the standard
form

$$ \exp\left[ g_1(K) + K' s'_0 s'_1 \right] \tag{16.111} $$

and then use (16.107) to set

$$ \tanh K' = \tanh^2 K \tag{16.112} $$

and identify

$$ g_1(K) = \ln\left( \frac{2 \cosh^2 K}{\cosh K'} \right). \tag{16.113} $$

Exactly the same steps can be followed through for the sum on s_3 in (16.105),
and indeed for all the sums over the ‘integrated out’ spins. The upshot is
that, apart from the accumulated spin-independent part, the new partition
function, defined on a lattice of spacing 2a, has the same form as the old one, but
with a new coupling K′ related to the old one K by (16.112).

FIGURE 16.5
A ‘coarsening’ transformation applied to the lattice portion shown in figure
16.4. The new (primed) spin variables are situated twice as far apart as the
original (unprimed) ones.

Equation (16.112) is an example of a renormalization transformation: the 
number of degrees of freedom has been halved, the lattice spacing has doubled, 
and the coupling $K$ has been renormalized to $K'$. 

It is clear that we could apply the same procedure to the new Hamiltonian, 
introducing a coupling $K''$ which is related to $K'$, and thence to $K$, by 

$$\tanh K'' = (\tanh K')^2 = (\tanh K)^4. \tag{16.114}$$

This is equivalent to iterating the renormalization transformation; after n 
iterations, the effective lattice constant is $2^n a$, and the effective coupling is 
given by 

$$\tanh K^{(n)} = (\tanh K)^{2^n}. \tag{16.115}$$


The successive values K', K",... of the coupling under these iterations can 
be regarded as a ‘flow’ in the (one-dimensional) space of K-values: a renor- 
malization flow. 

Of particular interest is a point (or points) $K^*$ such that 

$$\tanh K^* = \tanh^2 K^*. \tag{16.116}$$

This is called a fixed point of the renormalization transformation. At such a 
point in K-space, changing the scale by a factor of 2 (or $2^n$ for that matter) 
will make no difference, which means that the system must be in some sense 
ordered. Remembering that $K = J/(k_B T)$, we see that $K = K^*$ when the 
temperature is ‘tuned’ to the value $T = T^* = J/(k_B K^*)$. Such a $T^*$ would 
be the temperature of a critical point for the thermodynamics of the system, 
corresponding to the onset of ordering. In the present case, the only fixed 
points are $K^* = \infty$ and $K^* = 0$. Thus there is no critical point at a non-zero 
$T^*$, and hence no transition to an ordered phase. However, we may describe 
the behaviour as $T \to 0$ as ‘quasi-critical’. For large K, we may use 

$$\tanh K \simeq 1 - 2e^{-2K} \tag{16.117}$$

to write (16.115) as 

$$K^{(n)} \simeq K - \frac{1}{2}\ln(2^n), \tag{16.118}$$

which shows that $K^{(n)}$ changes only very slowly (logarithmically in the scale 
factor $2^n$) under iterations when in the vicinity of a very large value of K, so 
that this is ‘almost’ a fixed point. 

We may represent the flow of K, under the renormalization transformation 
(16.115), as in figure 16.6. Note that the flow is away from the quasi-fixed 
point at $K^* = \infty$ (T = 0) and towards the (non-interacting) fixed point at 
$K^* = 0$. 
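
The flow just described can be followed by iterating the transformation directly. This is a minimal sketch, assuming only (16.112) and the large-K estimate (16.118); the starting values 5.0 and 0.5 and the function names are illustrative.

```python
import math

def ising_rg_step(K):
    """One decimation step, tanh K' = tanh^2 K, as in (16.112)."""
    return math.atanh(math.tanh(K) ** 2)

def flow(K, n_steps):
    """Return [K, K', K'', ...] after n_steps iterations."""
    values = [K]
    for _ in range(n_steps):
        values.append(ising_rg_step(values[-1]))
    return values

K0 = 5.0  # illustrative large coupling (low temperature)
for n, K_n in enumerate(flow(K0, 6)):
    approx = K0 - 0.5 * n * math.log(2.0)  # large-K estimate (16.118)
    print(n, K_n, approx)

# A small starting coupling runs quickly to the fixed point K* = 0.
print(flow(0.5, 10)[-1])
```

For the large starting value the coupling falls by only about $\frac{1}{2}\ln 2$ per step, which is the slow drift away from the quasi-fixed point; for the small starting value it collapses to zero within a few steps.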

FIGURE 16.6 
‘Renormalization flow’: the arrows show the direction of flow of the coupling 
K as the lattice constant is increased. The starred values are fixed points. 


FIGURE 16.7 
The renormalization flow for the transformation (16.120). 


A renormalization transformation which has a fixed point at a finite (neither 
zero nor infinite) value of the coupling is clearly of greater interest, since 
this will correspond to a critical point at a finite temperature. A simple such 
example given by Kadanoff (1977) is the transformation 

$$K' = \frac{1}{2}(2K)^2 \tag{16.119}$$

for a doubling of the effective lattice size, or 

$$K^{(n)} = \frac{1}{2}(2K)^{2^n} \tag{16.120}$$

for n such iterations. The model leading to (16.120) involves fermions in one 
dimension, but the details are irrelevant to our purpose here. The 
renormalization transformation (16.120) has three fixed points: $K^* = 0$, $K^* = \infty$ and 
the finite point $K^* = \frac{1}{2}$. The renormalization flow is shown in figure 16.7. 

The striking feature of this flow is that the motion is always away from 
the finite fixed point, under successive iterations. This may be understood by 
recalling that at the fixed point (which is a critical point for the statistical 
system) the correlation length $\xi$ must be infinite (as $L \to \infty$). As we iterate 
away from this point, $\xi$ decreases and we leave the fixed (or critical) point. 
For this model, $\xi$ is given by Kadanoff (1977) as 

$$\xi = \frac{a}{|\ln 2K|} \tag{16.121}$$

which indeed goes to infinity at $K = \frac{1}{2}$. 
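
The instability of the finite fixed point is easy to see numerically. The sketch below assumes only (16.119) to (16.121); the starting values and function names are illustrative, and the lattice spacing is set to a = 1.

```python
import math

def kadanoff_step(K):
    """One doubling of the lattice spacing, K' = (1/2)(2K)^2, as in (16.119)."""
    return 0.5 * (2.0 * K) ** 2

def kadanoff_n(K, n):
    """Closed form (16.120) for n doublings."""
    return 0.5 * (2.0 * K) ** (2 ** n)

def correlation_length(K, a=1.0):
    """xi = a / |ln 2K|, as in (16.121); returned as infinity at K = 1/2."""
    x = abs(math.log(2.0 * K))
    return math.inf if x == 0.0 else a / x

for K0 in (0.49, 0.5, 0.51):
    K = K0
    for _ in range(5):
        K = kadanoff_step(K)
    # iterated value, closed form, and correlation length at the start
    print(K0, K, kadanoff_n(K0, 5), correlation_length(K0))
```

Starting exactly at K = 1/2 the coupling stays put, while starting values slightly below or above run towards 0 or towards large K respectively.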


16.4.3 Connections with particle physics 


Let us now begin to think about how all this may relate to the treatment of 
the renormalization group in particle physics, as given in the previous chapter. 

FIGURE 16.8 
The $\beta$-function of (16.124); the arrows indicate increasing f. 

First, we need to consider a continuous change of scale, say by a factor of f. 
In the present model, the transformation (16.120) then becomes 

$$K(fa) = \frac{1}{2}[2K(a)]^f. \tag{16.122}$$

Differentiating (16.122) with respect to f, we find 

$$f\frac{dK(fa)}{df} = K(fa)\ln[2K(fa)]. \tag{16.123}$$

We may reasonably call (16.123) a renormalization group equation, describing 
the ‘running’ of K(fa) with the scale f, analogous to the RGEs for $\alpha$ and $\alpha_s$ 
considered in chapter 15. In this case, the $\beta$-function is 

$$\beta(K) = K\ln(2K), \tag{16.124}$$

which is sketched in figure 16.8. The zero of $\beta$ is indeed at the fixed (critical) 
point $K = \frac{1}{2}$, and this is an infrared unstable fixed point, the flow being away 
from it as f increases. 
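
As a quick consistency check, the step from (16.122) to (16.123) can be verified with a finite difference. This is a sketch only: the values K(a) = 0.6 and f = 3, the step size h, and the function names are arbitrary illustrative choices.

```python
import math

def K_scaled(K_a, f):
    """K(fa) = (1/2) [2 K(a)]^f, as in (16.122)."""
    return 0.5 * (2.0 * K_a) ** f

def beta(K):
    """beta(K) = K ln(2K), as in (16.124)."""
    return K * math.log(2.0 * K)

K_a, f, h = 0.6, 3.0, 1e-6
# Central finite difference for f dK(fa)/df, the left-hand side of (16.123).
lhs = f * (K_scaled(K_a, f + h) - K_scaled(K_a, f - h)) / (2.0 * h)
rhs = beta(K_scaled(K_a, f))
print(lhs, rhs)

# Sign of beta on either side of the fixed point K = 1/2.
print(beta(0.4), beta(0.5), beta(0.6))
```

The sign change of $\beta$ at K = 1/2 (negative below, positive above) is what drives the flow away from the fixed point as f increases.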

The foregoing is exactly analogous to the discussion in section 15.5: see in 
particular figure 15.6 and the related discussion. Note, however, that in the 
present case we are considering rescalings in position space, not momentum 
space. Since momenta are measured in units of $a^{-1}$, it is clear that scaling 
a by f is the same as scaling k by $f^{-1} = t$, say. This will produce a change 
in sign in dK/dt relative to dK/df, and accounts for the fact that $K = \frac{1}{2}$ is 
an infrared unstable fixed point in figure 16.8, while $\alpha_s^*$ is an infrared stable 
fixed point in figure 15.6(b). Allowing for the change in sign, figure 16.8 is 
quite analogous to figure 15.6(a). 
We have emphasized that, at a critical point, and in the continuum limit, 
the correlation length $\xi \to \infty$, or equivalently the mass parameter (cf (16.102)) 
$m = \xi^{-1} \to 0$. In this case, the Fourier transform of the spin-spin correlation 
function should behave as 

$$\tilde{G}(k^2) \propto \frac{1}{k^2}. \tag{16.125}$$

This is indeed the $k^2$-dependence of the propagator of a free, massless scalar 
particle, but (as we learned for the fermion propagator in section 15.5) it 
is no longer true in an interacting theory. In the interacting case, (16.125) 
generally becomes modified to 

$$\tilde{G}(k^2) \propto \frac{1}{(k^2)^{1-\eta/2}} \tag{16.126}$$

or equivalently 

$$G(x) \propto \frac{1}{|x|^{1+\eta}} \tag{16.127}$$

in three spatial dimensions, and in the continuum limit. Thus, at a critical 
point, the spin-spin correlation function exhibits scaling under the 
transformation $x' = fx$, but it is not free-field scaling. Comparing (16.126) with (15.75), 
we see that $\eta/2$ is precisely the anomalous dimension of the field s(x), so 
(just as in section 15.5) we have an example of scaling with anomalous 
dimension. In the statistical mechanics case, $\eta$ is a critical exponent, one of a 
number of such quantities characterizing the critical behaviour of a system. 
In general, $\eta$ will depend on the coupling constant, $\eta(K)$: at a non-trivial 
fixed point, $\eta$ will be evaluated at the fixed point value $K^*$, $\eta(K^*)$. Enormous 
progress was made in the theory of critical phenomena when the powerful 
methods of quantum field theory were applied to calculate critical exponents 
(see for example Peskin & Schroeder 1995, chapter 13, and Binney et al. 
1992). 

In our discussion so far, we have only considered simple models with just 
one ‘coupling constant’, so that diagrams of renormalization flow were one- 
dimensional. Generally, of course, Hamiltonians will consist of several terms, 
and the behaviour of all their coefficients will need to be considered under a 
renormalization transformation. The general analysis of renormalization flow 
in multi-dimensional coupling space was given by Wegner (1972). In simple 
terms, the coefficients show one of three types of behaviour under renormalization 
transformations such that $a \to fa$, characterized by their behaviour in 
the vicinity of a fixed point: (i) the difference from the fixed point value grows 
as f increases, so that the system moves away from the fixed point (as in the 
single-coupling examples considered earlier); (ii) the difference decreases as f 
increases, so the system moves towards the fixed point; (iii) there is no change 
in the value of the coupling as f changes. The corresponding coefficients are 
called, respectively, (i) relevant, (ii) irrelevant and (iii) marginal couplings; the 
terminology is also frequently applied to the operators in the Hamiltonians 
themselves. The intuitive meaning of ‘irrelevant’ is clear enough: the system 
will head towards a fixed point as $f \to \infty$ whatever the initial values of the 
irrelevant couplings. The critical behaviour of the system will therefore be 
independent of the number and type of all irrelevant couplings, and will be 
determined by the relatively few (in general) marginal and relevant couplings. 
Thus all systems which flow close to the fixed point will display the same 
critical exponents determined by the dynamics of these few couplings. This 
explains the property of universality observed in the physics of phase transi- 
tions, whereby many apparently quite different physical systems are described 
(in the vicinity of their critical points) by the same critical exponents. 

Additional terms in the Hamiltonian are, in fact, generally introduced 
following a renormalization transformation. In the quantum field case, we 
may expect that renormalization transformations associated with $a \to fa$, 
and iterations thereof, will in general lead to an effective theory involving all 
possible couplings allowed by whatever symmetries are assumed to be relevant. 
Thus, if we start with a typical ‘$\phi^4$’ scalar theory as given by (16.98), we 
shall expect to generate all possible couplings involving $\phi$ and its derivatives. 
At first sight, this may seem disturbing: after all, the original theory (in 
four dimensions) is a renormalizable one, but an interaction such as $A\phi^6$ is 
not renormalizable according to the criterion given in section 11.8 (in four 
dimensions $\phi$ has mass dimension unity, so that A must have mass dimension 
-2). It is, however, essential to remember that in this ‘Wilsonian’ approach to 
renormalization, summations over momenta appearing in loops do not, after 
one iteration $a \to fa$, run up to the original cut-off value $\pi/a$, but only up 
to the lower cut-off $\pi/fa$. The additional interactions compensate for this 
change. 

In fact, we shall now see how the coefficients of non-renormalizable 
interactions correspond precisely to irrelevant couplings in Wilson's approach, so 
that their effect becomes negligible as we iterate to scales much larger than 
a. We consider continuous changes of scale characterized by a factor f, and 
we discuss a theory with only a single scalar field $\phi$ for simplicity. Imagine, 
therefore, that we have integrated out, in (16.97), those components of $\phi(x)$ 
with $a < |x| < fa$. We will be left with a functional integral of the form 
(16.97), but with $\phi(x)$ restricted to $|x| > fa$, and with additional interaction 
terms in the action. In order to interpret the result in Wilson's terms, we 
must rewrite it so that it has the same general form as the original Z of 
(16.97). A simple way to do this is to rescale distances by 

$$x' = \frac{x}{f} \tag{16.128}$$

so that the functional integral is now over $\phi(x')$ with $|x'| > a$, as in (16.97). 
We now define the fixed point of the renormalization transformation to be 
that in which all the terms in the action are zero, except the ‘kinetic’ piece; 
this is the ‘free-field’ fixed point. Thus, we require the kinetic action to be 
unchanged: 

$$\int d^4x\,\partial_\mu\phi\,\partial^\mu\phi = \int d^4x'\,\partial'_\mu\phi'\,\partial'^\mu\phi' = \frac{1}{f^2}\int d^4x\,\partial_\mu\phi'\,\partial^\mu\phi' \tag{16.129}$$

from which it follows that $\phi' = f\phi$. Consider now a term of the form $A\phi^6$: 

$$A\int d^4x\,\phi^6 = \frac{A}{f^2}\int d^4x'\,\phi'^6. \tag{16.130}$$

(16.130) shows that the ‘new’ $A'$ is related to the old one by $A' = A/f^2$ and 
in particular that, as f increases, $A'$ decreases and is therefore an irrelevant 
coupling, tending to zero as we reach large scales. But such an interaction 
is precisely a non-renormalizable one (in four dimensions), according to the 
criterion of section 11.8. The mass dimension of $\phi$ is unity, and hence that 
of A must be -2 so that the action is dimensionless; couplings with negative 
mass dimensions correspond to non-renormalizable interactions. The reader 
may verify the generality of this result for any interaction with p powers of $\phi$, 
and q derivatives of $\phi$. 
However, the mass term $m^2\phi^2$ behaves differently: 

$$m^2\int d^4x\,\phi^2 = m^2 f^2\int d^4x'\,\phi'^2 \tag{16.131}$$

showing that $m'^2 = m^2 f^2$ and the ‘coupling’ $m^2$ is relevant, since it grows 
as $f^2$. Such a term has positive mass dimension, and corresponds to a 
‘super-renormalizable’ interaction. Finally, the $\lambda\phi^4$ interaction transforms as 

$$\lambda\int d^4x\,\phi^4 = \lambda\int d^4x'\,\phi'^4 \tag{16.132}$$

and so $\lambda' = \lambda$. The coupling is marginal, which may correspond (though 
not necessarily) to a renormalizable interaction. To find out if such couplings 
increase or decrease with f, we have to include higher-order loop corrections. 
The foregoing analysis in terms of the suppression of non-renormalizable 
interactions by powers of $f^{-1}$ parallels precisely the similar one in section 11.8. 
We saw that such terms were suppressed at low energies by factors of $E/\Lambda$, 
where $\Lambda$ is the cut-off scale beyond which the theory is supposed to fail on 
physical grounds (e.g. $\Lambda$ might be the Planck mass). The result is that as 
we renormalize, in Wilson's sense, down to much lower energy scales, the 
non-renormalizable terms disappear and we are left with an effective 
renormalizable theory. This is the field theory analogue of ‘universality’. 
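
The three cases above follow one counting rule, which can be written as a short sketch. It assumes only the rescalings used in (16.128) to (16.132), namely $d^4x = f^4 d^4x'$, one factor of $f^{-1}$ per derivative and $\phi = \phi'/f$, so that a term with p powers of $\phi$ and q derivatives has its coefficient multiplied by $f^{4-p-q}$. The function names are illustrative.

```python
def scaling_exponent(p, q):
    """Exponent k such that g' = g * f**k for a term with p powers of phi
    and q derivatives in four dimensions, using d^4x = f^4 d^4x',
    d/dx = (1/f) d/dx' and phi = phi'/f."""
    return 4 - p - q

def classify(p, q):
    """Label the coupling by its behaviour as f increases."""
    k = scaling_exponent(p, q)
    if k > 0:
        return "relevant"
    if k < 0:
        return "irrelevant"
    return "marginal"

for name, p, q in [("m^2 phi^2", 2, 0), ("lambda phi^4", 4, 0),
                   ("A phi^6", 6, 0), ("kinetic term", 2, 2)]:
    print(name, scaling_exponent(p, q), classify(p, q))
```

The exponent 4 - p - q is also the mass dimension of the coupling, which is why negative mass dimension and irrelevance go together at this (tree) level; as the text notes, marginal couplings need loop corrections to decide their fate.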

These ideas have an important application in lattice QCD. One of the 
reasons for systematic inaccuracies in lattice computations is that the 
continuum is being simulated by a lattice of finite spacing. Symanzik (1983) showed 
that corrections to continuum theory results stemming from finite lattice 
spacing could be diminished systematically by the use of lattice actions that also 
include suitable irrelevant terms. This procedure is routinely adopted in 
accurate lattice calculations with ‘Symanzik-improved’ actions. 

One further word should be said about terms such as ‘$m^2\phi^2$’ (which arise 
in the Higgs sector of the Standard Model, for instance). As we have seen, 
$m^2$ scales by $m'^2 = m^2 f^2$, which is a rapid growth with f. If we imagine 
starting at a very high scale, such as $10^{16}$ TeV and flowing down to 1 TeV, 
then the ‘initial’ value of m will have to be very finely ‘tuned’ in order to end 
up with a mass of order 1 TeV. Thus, in this picture, it seems unnatural to 
have scalar particles with masses much less than the physical cut-off scale, 
unless some symmetry principle ‘protects’ their light masses. We shall return 
to this problem in section 22.8.1. 

We now return to lattice QCD, with a brief survey of some of the impressive 
results now being obtained numerically. 




16.5 Lattice QCD 
16.5.1 Introduction, and the continuum limit 


Let us begin by considering some numbers. The lattice must be large enough 
so that the spatial dimension R of the object we wish to describe (say the size 
of a hadron) fits comfortably inside it, otherwise the result will be subject 
to ‘finite size effects’ as the hypercube side length L is varied. We also need 
$R \gg a$, or else the granularity of the lattice resolution will become apparent. 
Further, as indicated earlier, we expect the mass m (which is of order $R^{-1}$) 
to be very much less than $a^{-1}$. Thus ideally we need 

$$a \ll R \sim 1/m \ll L = Na \tag{16.133}$$

so that N must be large. For example, if N = 64 and $a \approx 0.1$ fm the condition 
(16.133) would be reasonably satisfied by a light hadron mass. But remember 
that each field at each lattice point is an independent degree of freedom: 
dealing with integrals such as (16.87) presents a formidable numerical challenge. 
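
To put rough numbers on this, the sketch below evaluates the quantities in (16.133) for N = 64 and a = 0.1 fm. Several ingredients are assumptions made for illustration and are not taken from the text: a hypercubic lattice with $N^4$ sites, one SU(3) link variable per site and direction described by 8 real parameters, and the conversion constant $\hbar c \approx 0.1973$ GeV fm.

```python
HBARC_GEV_FM = 0.1973269804  # hbar*c in GeV fm (conversion constant)

N = 64          # lattice points per side, from the text
a_fm = 0.1      # lattice spacing in fm, from the text

L_fm = N * a_fm                      # box side L = N a
cutoff_gev = HBARC_GEV_FM / a_fm     # inverse lattice spacing in GeV
sites = N ** 4                       # assumed hypercubic lattice with N^4 sites

# Assumption: one SU(3) link variable per site and direction,
# each described by 8 real parameters.
gauge_dof = sites * 4 * 8

print(L_fm, cutoff_gev, sites, gauge_dof)
```

With these assumptions the box is 6.4 fm across, the inverse spacing is about 2 GeV, and the gauge field alone carries several hundred million real integration variables.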

Ignoring any statistical inaccuracy, the results will depend on the 
parameters $g_L$ and N, where $g_L$ is the bare lattice gauge coupling (we assume for 
simplicity that the quarks are massless). Despite the fact that $g_L$ is 
dimensionless, we shall now see that its value actually controls the physical size of 
the lattice spacing a, as a result of renormalization effects. The computed 
mass of a hadron M, say, must be related to the only quantity with mass 
dimension, $a^{-1}$, by a relation of the form 

$$M = \frac{1}{a} f(g_L). \tag{16.134}$$


Thus in approaching the continuum limit $a \to 0$, we shall also have to change 
$g_L$ suitably, so as to ensure that M remains finite. This is, of course, quite 
analogous to saying that, in a renormalizable theory, the bare parameters of 
the theory depend on the momentum cut-off $\Lambda$ in such a way that, as $\Lambda \to \infty$, 
finite values are obtained for the corresponding physical parameters (see the 
last paragraph of section 10.1.2, for example). In practice, of course, the 
extent to which the lattice ‘a’ can really be taken to be very small is severely 
limited by the computational resources available, that is, essentially, by the 
number of mesh points N. 
Equation (16.134) should therefore really read 

$$M = \frac{1}{a} f(g_L(a)). \tag{16.135}$$


As $a \to 0$, M should be finite and independent of a. However, we know that 
the behaviour of $g_L(a)$ at small scales is in fact calculable in perturbation 
theory, thanks to the asymptotic freedom of QCD. This will allow us to 
determine the form of $f(g_L)$, up to a constant, and lead to an interesting prediction 
for M (equations (16.141)-(16.142)). 

Differentiating (16.135) we find 

$$0 = \frac{dM}{da} = -\frac{1}{a^2} f + \frac{1}{a}\frac{df}{dg_L}\frac{dg_L(a)}{da} \tag{16.136}$$

so that 

$$\left(a\frac{dg_L(a)}{da}\right)\frac{df}{dg_L} = f(g_L(a)). \tag{16.137}$$

Meanwhile, the scale dependence of $g_L$ is given (to one loop order) by 

$$a\frac{dg_L(a)}{da} = \frac{\beta_0}{4\pi}\,g_L^3(a) \tag{16.138}$$

where the sign is the opposite of (15.47) since $a \sim \mu^{-1}$ is the relevant scale 
parameter here (compare the comments after equation (16.124)). The 
integration of (16.138) requires, as usual, a dimensionful constant of integration 
(cf (15.53)): 

$$\frac{g_L^2(a)}{4\pi} = \frac{1}{\beta_0\ln(1/a^2\Lambda_L^2)}. \tag{16.139}$$

Equation (16.139) shows that $g_L(a)$ tends logarithmically to zero as $a \to 0$, as 
we expect from asymptotic freedom. $\Lambda_L$ can be regarded as a lattice equivalent 
of the continuum $\Lambda_{\overline{\rm MS}}$, and it is defined (at one loop order) by 

$$\Lambda_L = \lim_{g_L\to 0}\frac{1}{a}\exp\left(-\frac{2\pi}{\beta_0 g_L^2}\right). \tag{16.140}$$

Equation (16.140) may also be read as showing that the lattice spacing a must 
go exponentially to zero as $g_L$ tends to zero. Higher-order corrections can of 
course be included. 


In a similar way, integrating (16.137) using (16.138) gives, in (16.134), 

$$M = \text{constant} \times \left[\frac{1}{a}\exp\left(-\frac{2\pi}{\beta_0 g_L^2}\right)\right] \tag{16.141}$$

$$\phantom{M} = \text{constant} \times \Lambda_L. \tag{16.142}$$

Equation (16.141) is known as asymptotic scaling: it predicts how any physical 
mass, expressed in lattice units $a^{-1}$, should vary as a function of $g_L$. The 
form (16.142) is remarkable, as it implies that all calculated masses must be 
proportional, in the continuum limit $a \to 0$, to the same universal scale factor 
$\Lambda_L$. 
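
The one-loop relations can be checked numerically. The sketch below assumes the form $\beta_0 = (33 - 2N_f)/(12\pi)$ for the one-loop coefficient (an assumption here, since this section does not restate it), takes the quenched case $N_f = 0$, and works in units where $\Lambda_L = 1$. It evaluates $g_L^2(a)$ from (16.139) and confirms that the combination appearing in (16.140) and (16.141) does not depend on a.

```python
import math

def beta0(n_f):
    """One-loop coefficient; the form (33 - 2 n_f)/(12 pi) is an assumption here."""
    return (33.0 - 2.0 * n_f) / (12.0 * math.pi)

def g_squared(a, lam, b0):
    """g_L^2(a) from (16.139), valid for a * lam < 1."""
    return 4.0 * math.pi / (b0 * math.log(1.0 / (a * lam) ** 2))

def lambda_from(a, g2, b0):
    """(1/a) exp(-2 pi / (beta_0 g_L^2)), the combination in (16.140)-(16.141)."""
    return math.exp(-2.0 * math.pi / (b0 * g2)) / a

b0 = beta0(0)      # quenched case, n_f = 0 (illustrative)
lam = 1.0          # Lambda_L in arbitrary units (illustrative)
for a in (1e-1, 1e-2, 1e-3):
    g2 = g_squared(a, lam, b0)
    print(a, g2, lambda_from(a, g2, b0))
```

The coupling falls only logarithmically as a is reduced by factors of ten, while the last column stays fixed at $\Lambda_L$, which is the content of asymptotic scaling at this order.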

How are masses calculated on the lattice? The principle is very similar to 
the way in which the ground state was selected out as $\tau_i \to -\infty$, $\tau_f \to +\infty$ in 
(16.78). Consider a correlation function for a scalar field, for simplicity: 

$$C(\tau) = \langle\Omega|\phi(\mathbf{x}=0,\tau)\,\phi(0)|\Omega\rangle = \sum_n |\langle\Omega|\phi(0)|n\rangle|^2\, e^{-E_n\tau}. \tag{16.143}$$

As $\tau \to \infty$, the term with the minimum value of $E_n$, namely $E_n = M_\phi$, will 
survive; $M_\phi$ can be measured from a fit to the exponential fall-off as a function 
of $\tau$. 
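
A common way to display this exponential fall-off is through an ‘effective mass’, the logarithm of the ratio of the correlator at neighbouring times. The sketch below uses a purely hypothetical two-state spectrum in lattice units (the overlaps and energies are invented for illustration) and shows the effective mass settling onto the lowest energy at large $\tau$.

```python
import math

def correlator(tau, spectrum):
    """C(tau) = sum_n |Z_n|^2 exp(-E_n tau) for a list of (|Z_n|^2, E_n)."""
    return sum(z2 * math.exp(-e * tau) for z2, e in spectrum)

def effective_mass(tau, spectrum, dtau=1.0):
    """ln[C(tau)/C(tau+dtau)]/dtau, which tends to the lowest E_n at large tau."""
    ratio = correlator(tau, spectrum) / correlator(tau + dtau, spectrum)
    return math.log(ratio) / dtau

# Hypothetical spectrum in lattice units: ground state and one excited state.
spectrum = [(1.0, 0.5), (0.7, 1.2)]
for tau in (0, 2, 5, 10, 20):
    print(tau, effective_mass(tau, spectrum))
```

At small $\tau$ the excited state contaminates the result; by $\tau = 20$ the effective mass is indistinguishable from the ground-state value 0.5. In a real calculation the correlator carries statistical noise, so the plateau is fitted rather than read off.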

The behaviour predicted by (16.141) and (16.142) can be tested in actual 
calculations. A quantity such as the $\rho$ meson mass is calculated via a 
correlation function of the form (16.143), the result being expressed in terms of a 
certain number of lattice units $a^{-1}$ at a certain value of $g_L$. By comparison 
with the known $\rho$ mass, $a^{-1}$ can be converted to GeV. Then the calculation 
is repeated for a different $g_L$ value and the new $a^{-1}$ (GeV) extracted. A 
plot of $\ln[a^{-1}\,({\rm GeV})]$ versus $1/g_L^2$ should then give a straight line with slope 
$2\pi/\beta_0$ and intercept $\ln\Lambda_L$. Figure 16.9 shows such a plot, taken from Ellis 
et al. (1996), from which it appears that the calculations are indeed being 
performed close to the continuum limit. The value of $\Lambda_L$ has been adjusted 
to fit the numerical data, and has the value $\Lambda_L = 1.74$ MeV in this case. This 
may seem alarmingly far from the kind of value expected for $\Lambda_{\rm QCD}$, but we 
must remember that the renormalization schemes involved in the two cases are 
quite different. In fact, we may expect $\Lambda_{\rm QCD} \approx 50\Lambda_L$ (Montvay and Münster 
(1994), section 5.1.6). 


16.5.2 The static $q\bar{q}$ potential 


FIGURE 16.9 
$\ln(a^{-1}$ in GeV) plotted against $1/g_L^2$; figure from R K Ellis, W J Stirling and B 
R Webber (1996) QCD and Collider Physics, courtesy Cambridge University 
Press, as adapted from Allton (1995). 


FIGURE 16.10 

The static QCD potential, expressed in units of $r_0$. The broken curve is the 
functional form (16.147). Figure reprinted with permission from C R Allton 
et al. (UKQCD Collaboration) Phys. Rev. D 65 054502 (2002). Copyright 
2002 by the American Physical Society. 


The calculations of $m_\rho$ represented in figure 16.9 were done in the quenched 
approximation. As a first example of a calculation with dynamical (unquenched) 
fermions we show in figure 16.10 a lattice calculation of the static $q\bar{q}$ potential 
(Allton et al. 2002, UKQCD Collaboration), using two degenerate flavours of 
dynamical quarks³ on a $16^3 \times 32$ lattice. As usual, one dimensionful quantity 
has to be fixed in order to set the scale. In the present case this has been 
done via the scale parameter $r_0$ of Sommer (1994), defined by 


$$r_0^2\left.\frac{dV}{dr}\right|_{r=r_0} = 1.65. \tag{16.144}$$

Applying (16.144) to the Cornell (Eichten et al. 1980) or Richardson (1979) 
phenomenological potentials gives $r_0 \simeq 0.49$ fm, conveniently in the range 
which is well-determined by $c\bar{c}$ and $b\bar{b}$ data. The data are well described by 
the expression 

$$V(r) = V_0 + \sigma r - \frac{A}{r}, \tag{16.145}$$

where in accordance with (16.144) 

$$\sigma = \frac{1.65 - A}{r_0^2}, \tag{16.146}$$

and where $V_0$ has been chosen such that $V(r_0) = 0$. Thus (16.145) becomes 

$$r_0 V(r) = (1.65 - A)\left(\frac{r}{r_0} - 1\right) - A\left(\frac{r_0}{r} - 1\right). \tag{16.147}$$


This is, up to a constant, exactly the functional form mentioned in chapter 
1, equation (1.33). The quantity $\sigma$ (there called b) is referred to as the 
‘string tension’, and $\sqrt{\sigma}$ has a value of about 465 MeV in the present calculations. 
Phenomenological models suggest a value of around 440 MeV (Eichten et al. 
1980). The parameter A is found to have a value of about 0.3. In 
lowest-order perturbation theory, and in the continuum limit, A would be given by 
one-gluon exchange as 

$$A = \frac{4}{3}\alpha_s(\mu^2) \tag{16.148}$$

where $\mu$ is some energy scale. This would give $\alpha_s \simeq 0.22$, a reasonable value 
for $\mu \simeq 3$ GeV. Interestingly, the form (16.147) is predicted by the ‘universal 
bosonic string model’ (Lüscher et al. 1980, Lüscher 1981), in which A has the 
‘universal’ value $\pi/12 \simeq 0.26$. 

The existence of the linearly rising term with $\sigma > 0$ is a signal for 
confinement, since (if the potential maintained this form) it would cost an infinite 
amount of energy to separate a quark and an antiquark. But at some point, 
enough energy will be stored in the ‘string’ to create a $q\bar{q}$ pair from the 
vacuum: the string then breaks, and the two $q\bar{q}$ pairs form mesons. There is no 
evidence for string breaking in figure 16.10, but we must note that the largest 
distance probed is only about 1.3 fm. 
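
The numbers quoted in this section can be cross-checked with a few lines. The sketch below uses A = 0.3 and $r_0$ = 0.49 fm from the text, together with the conversion constant $\hbar c \approx 197.327$ MeV fm (an ingredient not stated in the text). It verifies that the form (16.147) satisfies the Sommer condition (16.144), and evaluates $\sqrt{\sigma}$ from (16.146) and $\alpha_s$ from (16.148).

```python
import math

HBARC_MEV_FM = 197.3269804   # hbar*c in MeV fm (conversion constant)

def r0_times_V(x, A):
    """r0 * V(r) from (16.147), with x = r / r0."""
    return (1.65 - A) * (x - 1.0) - A * (1.0 / x - 1.0)

def sommer_combination(x, A, h=1e-6):
    """x^2 d(r0 V)/dx by central difference; this equals r^2 dV/dr."""
    dVdx = (r0_times_V(x + h, A) - r0_times_V(x - h, A)) / (2.0 * h)
    return x * x * dVdx

A = 0.3          # value quoted in the text
r0_fm = 0.49     # value quoted in the text

sqrt_sigma_mev = math.sqrt(1.65 - A) / r0_fm * HBARC_MEV_FM   # from (16.146)
alpha_s = 0.75 * A                                             # from (16.148)
print(sommer_combination(1.0, A), sqrt_sigma_mev, alpha_s)
```

With these rounded inputs $\sqrt{\sigma}$ comes out a little under 470 MeV, consistent with the quoted value of about 465 MeV given that A is only specified as ‘about 0.3’.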


³ Comparison with matched data in the quenched approximation revealed very little 
difference, in this case. 
16.5.3 Calculation of $\alpha_s(M_Z^2)$ 


Our second example of a precision lattice calculation with dynamical quarks is 
the determination of $\alpha_s(M_Z^2)$ by Davies et al. (2008) (HPQCD Collaboration). 
The reported value is 

$$\alpha_s(M_Z^2) = 0.1183(8). \tag{16.149}$$

The accuracy of this result is extremely impressive, and it implies that this 
determination is an important ingredient in the world average value quoted 
in (15.62). It is worth sketching some of the elements that went into this 
landmark calculation. 

The work used 12 gluon configurations from the MILC collaboration (Aubin 
et al. 2004), and built on a joint effort by several groups (see Davies et al. 
(HPQCD, UKQCD, MILC, and Fermilab collaborations) 2004). Vacuum po- 
larization effects from all three light quarks u, d and s were included, using a 
Symanzik-improved staggered-quark discretization, with rooting. The effects 
of c and b quarks were incorporated using perturbation theory. The strange 
quark mass was physical, while the u and d quark mass (set to be the same) 
was three times too large, but small enough for chiral perturbation theory 
(see chapter 18) to be reliable for extrapolating to the physical mass. 

There were 5 parameters: $m_u = m_d$, $m_s$, $m_c$, $m_b$ and the bare QCD 
coupling $g_L$ (or equivalently the lattice spacing a). The mass parameters were 
tuned to reproduce experimentally measured values of $m_\pi^2$, $2m_K^2 - m_\pi^2$, $m_{\eta_c}$ and 
$m_\Upsilon$ respectively. The lattice spacing was adjusted to make the $\Upsilon - \Upsilon'$ mass 
difference agree with experiment (Gray et al. 2005). With the free parameters 
all determined, the simulation accurately reproduced QCD, and predictions 
for physical quantities could proceed. En passant, we show in figure 16.11 
results obtained (Davies et al. 2004), divided by experimental results, for nine 
different quantities, without and with quark vacuum polarization (left and 
right panels respectively). The values on the left deviate from experiment by 
as much as 10% to 15%; those on the right agree with experiment to within 
systematic and statistical errors of 3% or less. 

To extract a value of the coupling constant, the general strategy is to 
calculate (with the tuned simulation) a non-perturbative numerical value for 
a short-distance quantity, for which perturbation theory should be reliable. 
Then, by comparing the numerically computed value to the known perturba- 
tive expansion, a value of the coupling constant can be found. 

In this case, the quantities calculated were vacuum expectation values of 
small Wilson loop operators $W_{mn}$ (and related quantities), defined by 

$$W_{mn} = \frac{1}{3}\langle 0|\,{\rm Re\,Tr\,P}\exp\left[-ig\oint A\cdot dx\right]|0\rangle, \tag{16.150}$$

where P denotes path ordering, $A_\mu = \boldsymbol{\lambda}/2\cdot\boldsymbol{A}_\mu$ is the QCD (matrix-valued) 
vector potential, and the integral is over a closed $ma \times na$ rectangular path, 
not necessarily planar. The $1 \times 1$ Wilson loop is just the vev of the simple 
plaquette operator Up of section 16.2.3. 
FIGURE 16.11 

Lattice QCD results divided by experimental results for nine different 
quantities, without and with quark vacuum polarization (left and right panels, 
respectively). Figure reprinted with permission from C T H Davies et al. 
(HPQCD Collaboration) Phys. Rev. Lett. 92 022001 (2004). Copyright 2004 
by the American Physical Society. 


In order to compare the numerical evaluation of (16.150) with perturbation 
theory, one has to decide what is a suitable expansion parameter. It was shown 
by Lepage and Mackenzie (1993) that the obvious first choice, the bare lattice 
coupling constant, is generally a poor one due to renormalization effects, even 
for short distance quantities. Instead, a renormalized coupling should be used, 
but this raises the questions of what renormalization scheme to adopt, and 
at what scale to evaluate the (now running) coupling. In the present 
case, the scheme proposed by Brodsky, Lepage and Mackenzie (1983) was 
followed. It is defined in terms of the heavy quark potential V(q), and is 
called the ‘V-scheme’. The strong coupling in the V-scheme is defined by 


$$V(q) = -\frac{4}{3}\,\frac{4\pi\alpha_V(q)}{q^2} \tag{16.151}$$

with no higher-order corrections. 


The numerically calculated short-distance quantities $Y^{(k)}$ are therefore to 
be expanded as the series 

$$Y^{(k)} = \sum_{n=1} c_n^{(k)}\,\alpha_V^n(d^{(k)}/a), \tag{16.152}$$
where $c_n^{(k)}$ and $d^{(k)}$ are dimensionless constants independent of the lattice 
spacing a, but dependent on the particular $Y^{(k)}$, and $\alpha_V(d^{(k)}/a)$ is the running 
QCD coupling in the V-scheme, with $N_f = 3$ light quark flavours. The 
perturbative coefficients $c_n^{(k)}$ for the various Y's were computed using Feynman 
diagrams, for $n \le 3$, for the same quark and gluon actions which were used to 
create the sets of gluon field configurations employed in the numerical 
evaluation of the Y's. The renormalization scale $d^{(k)}/a$ varies for each short-distance 
quantity, being chosen according to the Lepage-Mackenzie (1993) prescription 
(or in some cases a more robust procedure due to Hornbostel, Lepage and 
Morningstar (2003)). 

There were 22 $Y^{(k)}$'s, each of which was analyzed separately, fitting the 
expansion (16.152) to the 12 values of that Y calculated using the 12 gluon 
configurations. In the simplest terms, the result of each such fit would be the 
value of $\alpha_V$ at a particular scale, which was chosen to be $\alpha_V(7.5\ {\rm GeV})$. The 
values of $\alpha_V$ required at the scales $d^{(k)}/a_i$ were found by numerically 
integrating the evolution equation (at four-loop order) for $\alpha_V$; here $a_i$ is the lattice 
spacing for each configuration (there were 6 different spacings). In fact, the 
fitting was more sophisticated, including further parameters related to 
various corrections; the interested reader can consult Davies et al. (2008) for the 
details. Having obtained $\alpha_V(7.5\ {\rm GeV})$, this was then converted to the $\overline{\rm MS}$ 
scheme, using the relation (Brodsky, Lepage and Mackenzie 1983) 

$$\alpha_V(\mu) = \alpha_{\overline{\rm MS}}(e^{-5/6}\mu). \tag{16.153}$$

Finally, the resultant $\alpha_{\overline{\rm MS}}$ was evolved to $M_Z$. The value (16.149) is the final 
result after performing a weighted average over the 22 separate determina- 
tions. A full discussion of the error estimate, which includes finite lattice 
spacing, finite lattice volume, and chiral extrapolation uncertainties, is given 
in Davies et al. (2008). 
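
The core of the strategy, inverting a truncated perturbative series to obtain the coupling from a numerically computed quantity, can be illustrated with a toy example. Everything numerical below is hypothetical: the coefficients are invented, the ‘measured’ value is generated from a known coupling so that the inversion can be checked, and the running of the coupling and the additional fit parameters used in the real analysis are ignored.

```python
def series(alpha, coeffs):
    """Truncated expansion Y = sum_n c_n alpha^n, n = 1..len(coeffs)."""
    return sum(c * alpha ** (n + 1) for n, c in enumerate(coeffs))

def solve_for_alpha(y_measured, coeffs, lo=0.0, hi=1.0, tol=1e-12):
    """Bisection for alpha in [lo, hi]; assumes the series is increasing there."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if series(mid, coeffs) < y_measured:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Hypothetical coefficients c_1, c_2, c_3 and a hypothetical "measured" value,
# generated from a known coupling so that the inversion can be checked.
coeffs = [3.0, -1.0, 2.0]
alpha_true = 0.2
y = series(alpha_true, coeffs)
print(solve_for_alpha(y, coeffs))
```

Bisection is used here only because it is transparent; it is not a description of the fitting procedure of Davies et al. (2008).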


16.5.4 Hadron masses 


For our last example of a precise lattice QCD calculation, it is appropriate to 
consider the mass spectrum of light hadrons. After all, protons and neutrons 
account for nearly all the mass of ordinary matter, and 95% of their mass is 
the result of QCD interactions. It has long been a fundamental challenge to 
predict hadron masses accurately from QCD. 

As one example of such calculations, we show in figure 16.12 the light 
hadron spectrum of QCD as reported by Dürr et al. (2008). Horizontal lines 
and bands are the experimental values (which have been isospin-averaged) 
with their decay widths. The solid circles are the predicted values. Vertical 
error bars represent combined statistical and systematic error estimates. The 
masses of the $\pi$, K and $\Xi$ have no error bars, because they have been used to 
set the values of $m_u = m_d$, $m_s$ and the overall scale, respectively. Once again, 
the agreement with experiment is very impressive. 

FIGURE 16.12 
The light hadron spectrum of QCD, from Dürr et al. (2008). (See color plate 
II.) 


These calculations used a Symanzik-improved gauge action (Lüscher and 
Weisz 1985), and 2+1 flavours of light dynamical Wilson fermions, with 
various improvements (Morningstar and Peardon 2004). The physical scale was 
set either by fitting to the mass of the $\Xi$, or to the mass of the $\Omega$; the two 
ways gave consistent results. Pion masses in the range (approximately) 800 
MeV to 190 MeV were used to extrapolate to the physical value, with 
lattice sizes approximately four times the inverse pion mass. A particular type 
of finite-volume effect arises in the case of strongly decaying resonant states: 
a procedure for reconstructing the infinite-volume resonance mass, given by 
Lüscher (1986, 1991a, 1991b), was followed here. This was satisfactory, 
except for the $\rho$ and $\Delta$ at the lightest pion mass point, which was omitted from 
the extrapolation for these two channels. For further details, and additional 
references, we refer the reader to the supplementary material to Dürr et al. 
(2008) provided online. 

We have been able to give only a brief introduction to what is now, almost
forty years after its inception by Wilson (1974), the highly mature field
of lattice QCD. A great deal of effort has gone into ingenious and subtle 
improvements to the lattice action, to the numerical algorithms, and to the 
treatment of fermions — to name a few of the issues. Lattice QCD is now 
a major part of particle physics. From the perspective of this chapter and 
the previous one, we can confidently say that, both in the short-distance 
(perturbative) regime, and in the long-distance (non-perturbative) regime, 
QCD is established as the correct theory of the strong interactions of quarks, 
beyond reasonable doubt. 
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Problems 

16.1 Verify equation (16.9). 

16.2 Verify equation (16.10). 

16.3 Show that the momentum space version of (16.18) is (16.19). 

16.4 Use (16.31) in (16.33) to verify (16.34). 

16.5 Verify (16.68) and (16.70). 


16.6 In a modified one-dimensional Ising model, spin variables $s_n$ at sites
labelled by $n = 1, 2, 3, \ldots, N$ take the values $s_n = \pm 1$, and the energy of each
spin configuration is

$$ E = -\sum_{n=1}^{N-1} J_n s_n s_{n+1} , $$

where all the constants $J_n$ are positive. Show that the partition function $Z_N$
is given by

$$ Z_N = 2 \prod_{n=1}^{N-1} (2\cosh K_n) $$

where $K_n = J_n/k_{\rm B}T$. Hence calculate the entropy for the particular case in
which all the $J_n$'s are equal to $J$ and $N \gg 1$, and discuss the behaviour of
the entropy in the limits $T \to \infty$ and $T \to 0$.

Let ‘p’ denote a particular site such that $1 \ll p \ll N$. Show that the
average value $\langle s_p s_{p+1}\rangle$ of the product $s_p s_{p+1}$ is given by

$$ \langle s_p s_{p+1}\rangle = \frac{1}{Z_N}\frac{\partial Z_N}{\partial K_p} . $$

Show further that

$$ \langle s_p s_{p+j}\rangle = \frac{1}{Z_N}\frac{\partial^j Z_N}{\partial K_p\,\partial K_{p+1}\cdots\partial K_{p+j-1}} . $$

Hence show that in the case $J_1 = J_2 = \ldots = J_N = J$,

$$ \langle s_p s_{p+j}\rangle = {\rm e}^{-ja/\xi} $$

where

$$ \xi = -a/[\ln(\tanh K)] , $$

and $K = J/k_{\rm B}T$. Discuss the physical meaning of $\xi$, considering the $T \to \infty$
and $T \to 0$ limits explicitly.
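A brute-force enumeration over all $2^N$ configurations gives a quick numerical cross-check of the closed forms quoted in problem 16.6 for a short chain. The sketch below is illustrative only: the coupling values, the chain lengths and the helper name `brute_force` are arbitrary choices, sites are numbered from 0, and the lattice spacing is set to $a = 1$.

```python
import itertools
import math


def brute_force(K):
    """Return (Z, corr) for an open chain with reduced couplings K[0..N-2].

    K[n] = J_n / (k_B T) couples sites n and n+1 (sites numbered from 0).
    corr(p, j) is the thermal average <s_p s_{p+j}>.
    """
    N = len(K) + 1
    configs = list(itertools.product((-1, 1), repeat=N))
    # Boltzmann weight exp(-E / k_B T) with E = -sum_n J_n s_n s_{n+1}.
    weights = [math.exp(sum(K[n] * s[n] * s[n + 1] for n in range(N - 1)))
               for s in configs]
    Z = sum(weights)

    def corr(p, j):
        return sum(w * s[p] * s[p + j] for w, s in zip(weights, configs)) / Z

    return Z, corr


# Unequal couplings, N = 6 (arbitrary illustrative values).
K = [0.3, 0.7, 0.5, 1.1, 0.2]
Z, corr = brute_force(K)
Z_formula = 2.0 * math.prod(2.0 * math.cosh(k) for k in K)
# <s_1 s_4> involves the three bonds between sites 1 and 4.
corr_formula = math.prod(math.tanh(k) for k in K[1:4])

# Equal couplings: <s_p s_{p+j}> = exp(-j a / xi) with xi = -a / ln(tanh K).
K_equal, a = 0.8, 1.0
xi = -a / math.log(math.tanh(K_equal))
_, corr_eq = brute_force([K_equal] * 7)

print(Z, Z_formula)
print(corr(1, 3), corr_formula)
print(corr_eq(2, 3), math.exp(-3 * a / xi))
```

Each printed pair should agree to rounding error; the enumeration grows as $2^N$, so it is only practical for short chains.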
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17 


Spontaneously Broken Global Symmetry 


Previous chapters have introduced the non-Abelian symmetries SU(2) and 
SU(3) in both global and local forms, and we have seen how they may be 
applied to describe such typical physical phenomena as particle multiplets, 
and massless gauge fields. Remarkably enough, however, these symmetries 
are also applied, in the Standard Model, in two cases where the physical 
phenomena appear to be very different. Consider the following two questions: 
(i) Why are there no signs in the baryonic spectrum, such as parity doublets 
in particular, of the global chiral symmetry introduced in section 12.3.2? (ii) 
How can weak interactions be described by a local non-Abelian gauge theory 
when we know the mediating gauge field quanta are not massless? The answers 
to these questions each involve the same fundamental idea, which is a crucial 
component of the Standard Model, and perhaps also of theories which go 
beyond it. This is the idea that a symmetry can be ‘spontaneously broken’, 
or ‘hidden’. By contrast, the symmetries considered hitherto may be termed 
‘manifest symmetries’. 

The physical consequences of spontaneous symmetry breaking turn out to 
be rather different in the global and local cases. However, the essentials for 
a theoretical understanding of the phenomenon are contained in the simpler 
global case, which we consider in this chapter. The application to sponta- 
neously broken chiral symmetry will be treated in chapter 18, and sponta- 
neously broken local symmetry will be discussed in chapter 19, and applied in 
chapter 22. 




17.1 Introduction 


We begin by considering, in response to question (i) above, what could go 
wrong with the argument for symmetry multiplets that we gave in chapter 12. 
To understand this, we must use the field theory formulation of section 12.3, 
in which the generators of the symmetry are Hermitian field operators, and 
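the states are created by operators acting on the vacuum. Thus consider two
states $|A\rangle$, $|B\rangle$¹:

$$ |A\rangle = \hat\phi_A^\dagger|0\rangle, \qquad |B\rangle = \hat\phi_B^\dagger|0\rangle \tag{17.1} $$

where $\hat\phi_A^\dagger$ and $\hat\phi_B^\dagger$ are related to each other by (cf (12.100))

$$ [\hat Q, \hat\phi_A^\dagger] = \hat\phi_B^\dagger \tag{17.2} $$

for some generator $\hat Q$ of a symmetry group, such that

$$ [\hat Q, \hat H] = 0. \tag{17.3} $$

(17.2) is equivalent to

$$ \hat U\hat\phi_A^\dagger\hat U^{-1} \approx \hat\phi_A^\dagger + i\epsilon\hat\phi_B^\dagger \tag{17.4} $$

for an infinitesimal transformation $\hat U \approx 1 + i\epsilon\hat Q$. Thus $\hat\phi_A^\dagger$ is ‘rotated’ into $\hat\phi_B^\dagger$
by $\hat U$, and the operators will create states related by the symmetry
transformation. We want to see what are the assumptions necessary to prove that

$$ E_A = E_B, \quad\text{where}\quad \hat H|A\rangle = E_A|A\rangle \quad\text{and}\quad \hat H|B\rangle = E_B|B\rangle. \tag{17.5} $$

We have

$$ E_B|B\rangle = \hat H|B\rangle = \hat H\hat\phi_B^\dagger|0\rangle = \hat H(\hat Q\hat\phi_A^\dagger - \hat\phi_A^\dagger\hat Q)|0\rangle. \tag{17.6} $$

Now if

$$ \hat Q|0\rangle = 0 \tag{17.7} $$

we can rewrite the right-hand side of (17.6) as

$$ \begin{aligned}
\hat H\hat Q\hat\phi_A^\dagger|0\rangle &= \hat Q\hat H\hat\phi_A^\dagger|0\rangle \quad\text{using (17.3)} \\
&= \hat Q\hat H|A\rangle = E_A\hat Q|A\rangle \\
&= E_A\hat Q\hat\phi_A^\dagger|0\rangle = E_A(\hat\phi_B^\dagger + \hat\phi_A^\dagger\hat Q)|0\rangle \quad\text{using (17.2)} \\
&= E_A|B\rangle \quad\text{if (17.7) holds;}
\end{aligned} \tag{17.8} $$

whence, comparing (17.8) with (17.6), we see that

$$ E_A = E_B \quad\text{if (17.7) holds.} \tag{17.9} $$

Remembering that $\hat U = \exp(i\alpha\hat Q)$, we see that (17.7) is equivalent to

$$ |0\rangle' \equiv \hat U|0\rangle = |0\rangle. \tag{17.10} $$

1 We now revert to the ordinary notation $|0\rangle$ for the vacuum state, rather than $|\Omega\rangle$, but
it must be borne in mind that $|0\rangle$ is the full (interacting) vacuum.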


Thus a multiplet structure will emerge provided that the vacuum is left
invariant under the symmetry transformation. The ‘spontaneously broken
symmetry’ situation arises in the contrary case, that is, when the vacuum is not
invariant under the symmetry, which is to say when

$$ \hat Q|0\rangle \neq 0. \tag{17.11} $$

In this case, the argument for the existence of symmetry multiplets breaks
down, and although the Hamiltonian or Lagrangian may exhibit a non-Abelian
symmetry, this will not be manifested in the form of multiplets of
mass-degenerate particles.

The preceding italicized sentence does correctly define what is meant by 
a spontaneously broken symmetry in field theory, but there is another way of 
thinking about it which is somewhat less abstract though also less rigorous. 
The basic condition is $\hat Q|0\rangle \neq 0$, and it seems tempting to infer that, in this
case, the application of $\hat Q$ to the vacuum gives, not zero, but another possible
vacuum, $|0\rangle'$. Thus we have the physically suggestive idea of ‘degenerate
vacua’ (they must be degenerate since $[\hat Q, \hat H] = 0$). We shall see in a moment
why this notion, though intuitively helpful, is not rigorous. 

It would seem, in any case, that the properties of the vacuum are all- 
important, so we begin our discussion with a somewhat formal, but nonethe- 
less fundamental, theorem about the quantum field vacuum. 




17.2 The Fabri–Picasso theorem


Suppose that a given Lagrangian $\hat{\mathcal L}$ is invariant under some one-parameter
continuous global internal symmetry with a conserved Noether current $\hat j^\mu$,
such that $\partial_\mu\hat j^\mu = 0$. The associated ‘charge’ is the Hermitian operator
$\hat Q = \int \hat j^0\,{\rm d}^3x$, and $\dot{\hat Q} = 0$. We have hitherto assumed that the transformations of
such a U(1) group are representable in the space of physical states by unitary
operations $\hat U(\lambda) = \exp i\lambda\hat Q$ for arbitrary $\lambda$, with the vacuum invariant under
$\hat U$, so that $\hat Q|0\rangle = 0$. Fabri and Picasso (1966) showed that there are actually
two possibilities:

(i) $\hat Q|0\rangle = 0$, and $|0\rangle$ is an eigenstate of $\hat Q$ with eigenvalue 0, so that
$|0\rangle$ is invariant under $\hat U$ (i.e. $\hat U|0\rangle = |0\rangle$);

or

(ii) $\hat Q|0\rangle$ does not exist in the space (its norm is infinite).

The statement (ii) is technically more correct than the more intuitive
statements ‘$\hat Q|0\rangle \neq 0$’ or ‘$\hat U|0\rangle = |0\rangle'$’, suggested above.

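To prove this result, consider the vacuum matrix element $\langle 0|\hat j^0(x)\hat Q|0\rangle$.
From translation invariance, implemented by the unitary operator² $\hat U(x) =
\exp i\hat P\cdot x$ (where $\hat P^\mu$ is the 4-momentum operator) we obtain

$$ \begin{aligned}
\langle 0|\hat j^0(x)\hat Q|0\rangle &= \langle 0|{\rm e}^{i\hat P\cdot x}\,\hat j^0(0)\,{\rm e}^{-i\hat P\cdot x}\hat Q|0\rangle \\
&= \langle 0|{\rm e}^{i\hat P\cdot x}\,\hat j^0(0)\,\hat Q\,{\rm e}^{-i\hat P\cdot x}|0\rangle
\end{aligned} $$

where the second line follows from

$$ [\hat P^\mu, \hat Q] = 0 \tag{17.12} $$

since $\hat Q$ is an internal symmetry. But the vacuum is an eigenstate of $\hat P^\mu$ with
eigenvalue zero, and so

$$ \langle 0|\hat j^0(x)\hat Q|0\rangle = \langle 0|\hat j^0(0)\hat Q|0\rangle \tag{17.13} $$

which states that the matrix element we started from is in fact independent
of $x$. Now consider the norm of $\hat Q|0\rangle$:

$$ \begin{aligned}
\langle 0|\hat Q\hat Q|0\rangle &= \int {\rm d}^3x\,\langle 0|\hat j^0(x)\hat Q|0\rangle && (17.14) \\
&= \int {\rm d}^3x\,\langle 0|\hat j^0(0)\hat Q|0\rangle, && (17.15)
\end{aligned} $$

which must diverge in the infinite volume limit, unless $\hat Q|0\rangle = 0$. Thus either
$\hat Q|0\rangle = 0$ or $\hat Q|0\rangle$ has infinite norm. The foregoing can be easily generalized
to non-Abelian symmetry operators $\hat T_i$.

2 If this seems unfamiliar, it may be regarded as the 4-dimensional generalization of the
transformation (I.7) in appendix I of volume 1, from Schrödinger picture operators at $t = 0$
to Heisenberg operators at $t \neq 0$.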

Remarkably enough, the argument can also, in a sense, be reversed. Cole- 
man (1966) proved that if an operator 


$$ \hat Q(t) = \int {\rm d}^3x\,\hat j^0(x) \tag{17.16} $$

is the spatial integral of the $\mu = 0$ component of a 4-vector (but not assumed
to be conserved), and if it annihilates the vacuum

$$ \hat Q(t)|0\rangle = 0, \tag{17.17} $$

then in fact $\partial_\mu\hat j^\mu = 0$, $\hat Q$ is independent of $t$, and the symmetry is unitarily
implementable by operators $\hat U = \exp(i\lambda\hat Q)$.

We might now simply proceed to the chiral symmetry application. We 
believe, however, that the concept of spontaneous symmetry breaking is so 
important to particle physics that a more extended discussion is amply justi- 
fied. In particular, there are crucial insights to be gained by considering the 
analogous phenomenon in condensed matter physics. After a brief look at the 
ferromagnet, we shall describe the Bogoliubov model for the ground state of a 
superfluid, which provides an important physical example of a spontaneously 
broken global Abelian U(1) symmetry. We shall see that the excitations away 
from the ground state are massless modes and we shall learn, via Goldstone’s 
theorem, that such modes are an inevitable result of spontaneously breaking a 
global symmetry. Next, we shall introduce the ‘Goldstone model’ which is the 
simplest example of a spontaneously broken global U(1) symmetry, involving 
just one complex scalar field. The generalization of this to the non-Abelian 
case will draw us in the direction of the Higgs sector of the Standard Model. 
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Returning to condensed matter systems, we introduce the BCS ground state 
for a superconductor, in a way which builds on the Bogoliubov model of a 
superfluid. We are then prepared for the application, in chapter 18, to spon- 
taneous chiral symmetry breaking (question (i) above), following Nambu’s 
profound analogy with one aspect of superconductivity. In chapter 19 we 
shall see how a different aspect of superconductivity provides a model for the 
answer to question (ii) above. 




17.3 Spontaneously broken symmetry in condensed 
matter physics 


17.3.1 The ferromagnet 


We have seen that everything depends on the properties of the vacuum state. 
An essential aid to understanding hidden symmetry in quantum field theory 
is provided by Nambu’s (1960) remarkable insight that the vacuum state of a 
quantum field theory is analogous to the ground state of an interacting many- 
body system. It is the state of lowest energy — the equilibrium state, given 
the kinetic and potential energies as specified in the Hamiltonian. Now the 
ground state of a complicated system (for example, one involving interacting 
fields) may well have unsuspected properties — which may, indeed, be very 
hard to predict from the Hamiltonian. But we can postulate (even if we 
cannot yet prove) properties of the quantum field theory vacuum |0) which 
are analogous to those of the ground states of many physically interesting 
many-body systems — such as superfluids and superconductors, to name two 
with which we shall be principally concerned. 

Now it is generally the case, in quantum mechanics, that the ground state 
of any system described by a Hamiltonian is non-degenerate. Sometimes we 
may meet systems in which apparently more than one state has the same 
lowest energy eigenvalue. Yet in fact none of these states will be the true 
ground state: tunnelling will take place between the various degenerate states, 
and the true ground state will turn out to be a unique linear superposition 
of them. This is, in fact, the only possibility for systems of finite spatial 
extent, though in practice a state which is not the true ground state may 
have an extremely long lifetime. However, in the case of fields (extending 
presumably throughout all space), the Fabri-Picasso theorem shows that there 
is an alternative possibility, which is often described as involving a ‘degenerate 
ground state’, a term we shall now elucidate. In case (i) of the theorem, the
ground state is unique. For, suppose that several ground states $|0, a\rangle, |0, b\rangle, \ldots$
existed, with the symmetry unitarily implemented. Then one ground state will
be related to another by

$$ |0, a\rangle = {\rm e}^{i\lambda\hat Q}|0, b\rangle \tag{17.18} $$

for some $\lambda$. However, in case (i) the charge annihilates a ground state, and
so all of them are really identical. In case (ii), on the other hand, we cannot
write (17.18), since $\hat Q|0\rangle$ does not exist, and we do have the possibility
of many degenerate ground states. In simple models one can verify that
these alternative ground states are all orthogonal to each other, in the infinite
volume limit or, perhaps more physically, the limit in which the number
of degrees of freedom becomes infinite. And each member of every ‘tower’
of excited states, built on these alternative ground states, is also orthogonal
to all the members of other towers. But any single tower must constitute a
complete space of states. It follows that states in different towers belong to
different complete spaces of states, that is to different, and inequivalent,
‘worlds’, each one built on one of the possible orthogonal ground states.

At first sight, a familiar example of these ideas seems to be that of a
ferromagnet, below its Curie temperature $T_{\rm C}$. Consider an ‘ideal Heisenberg
ferromagnet’ with $N$ atoms each of spin 1/2, described by a Hamiltonian of
Heisenberg exchange form $\hat H_S = -J\sum_{i\neq j}\hat{\boldsymbol S}_i\cdot\hat{\boldsymbol S}_j$, where $i$ and $j$ label the atomic
sites. This Hamiltonian is invariant under spatial rotations, since it only
depends on the dot product of the spin operators. Such rotations are
implemented by unitary operators $\exp(i\hat{\boldsymbol S}\cdot\boldsymbol\alpha)$ where $\hat{\boldsymbol S} = \sum_i\hat{\boldsymbol S}_i$, and spins at
different sites are assumed to commute. As usual with angular momentum
in quantum mechanics, the eigenstates of $\hat H_S$ are labelled by the eigenvalues
of total squared spin, and of one component of spin, say of $\hat S_z = \sum_i\hat S_{iz}$.
The quantum mechanical ground state of $\hat H_S$ is an eigenstate with total spin
quantum number $S = N/2$, and this state is $(2\cdot N/2 + 1) = (N + 1)$-fold
degenerate, allowing for all the possible eigenvalues $(N/2, N/2 - 1, \ldots, -N/2)$
of $\hat S_z$ for this value of $S$. We are free to choose any one of these degenerate
states as ‘the’ ground state, say the state with eigenvalue $S_z = N/2$.

It is clear that the ground state is not invariant under the spin-rotation
symmetry of $\hat H_S$, which would require the eigenvalues $S = S_z = 0$. Furthermore,
this ground state is degenerate. So two important features of what
we have so far learned to expect of a spontaneously broken symmetry are
present, namely ‘the ground state is not invariant under the symmetry of
the Hamiltonian’, and ‘the ground state is degenerate’. However, it has to
be emphasized that this ferromagnetic ground state does, in fact, respect the
symmetry of $\hat H_S$, in the sense that it belongs to an irreducible representation
of the symmetry group: the unusual feature is that it is not the ‘trivial’ (sin- 
glet) representation, as would be the case for an invariant ground state. The 
spontaneous symmetry breaking which is the true model for particle physics 
is that in which a many body ground state is not an eigenstate (trivial or 
otherwise) of the symmetry operators of the Hamiltonian: rather it is a su- 
perposition of such eigenstates. We shall explore this for the superfluid and 
the superconductor in due course. 

Nevertheless, there are some useful insights to be gained from the ferro- 
magnet. First, consider two ground states differing by a spin rotation. In the 
first, the spins are all aligned along the 3-axis, say, and in the second along
the axis $\hat{\boldsymbol n} = (0, \sin\alpha, \cos\alpha)$. Thus the first ground state is

$$ \chi_0 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}_1 \begin{pmatrix} 1 \\ 0 \end{pmatrix}_2 \cdots \begin{pmatrix} 1 \\ 0 \end{pmatrix}_N \quad (N\ \text{products}) \tag{17.19} $$

while the second is (cf (4.31), (4.32))

$$ \chi_0^{(\alpha)} = \begin{pmatrix} \cos\alpha/2 \\ i\sin\alpha/2 \end{pmatrix}_1 \cdots \begin{pmatrix} \cos\alpha/2 \\ i\sin\alpha/2 \end{pmatrix}_N . \tag{17.20} $$

The scalar product of (17.19) and (17.20) is $(\cos\alpha/2)^N$, which goes to zero as
$N \to \infty$. Thus any two such ‘rotated ground states’ are indeed orthogonal in
the infinite volume (or infinite number of degrees of freedom) limit.
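The approach to orthogonality can be checked numerically. The following sketch assumes the single-site spinor $(\cos\alpha/2,\ i\sin\alpha/2)$ of (17.20); it first confirms that this spinor is polarised along $(0, \sin\alpha, \cos\alpha)$ and then evaluates the $N$-site overlap, which factorises site by site. The function names and the value of $\alpha$ are illustrative choices.

```python
import math


def rotated_spinor(alpha):
    """Spin-1/2 state polarised along n = (0, sin(alpha), cos(alpha))."""
    return (complex(math.cos(alpha / 2)), 1j * math.sin(alpha / 2))


def sigma_dot_n(alpha, spinor):
    """Apply sigma.n = sin(alpha) sigma_y + cos(alpha) sigma_z to a 2-spinor."""
    up, down = spinor
    c, s = math.cos(alpha), math.sin(alpha)
    return (c * up - 1j * s * down, 1j * s * up - c * down)


def product_overlap(alpha, N):
    """<chi_0 | chi_0^(alpha)> for N sites: the single-site overlap to the power N."""
    up = (1.0 + 0j, 0j)
    rot = rotated_spinor(alpha)
    single = up[0].conjugate() * rot[0] + up[1].conjugate() * rot[1]
    return single ** N


alpha = 0.3
for N in (1, 10, 100, 1000, 10000):
    print(N, abs(product_overlap(alpha, N)))
```

Even for a small rotation angle the overlap falls off exponentially in $N$, which is the sense in which the rotated ground states become orthogonal as the number of degrees of freedom grows.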

We may also enquire about the excited states built on one such ground
state, say the one with $\hat S_z$ eigenvalue $N/2$. Suppose for simplicity that
the magnet is one-dimensional (but the spins have all three components).
Consider the state $\chi_n = \hat S_{n-}\chi_0$ where $\hat S_{n-}$ is the spin lowering operator
$\hat S_{n-} = (\hat S_{nx} - i\hat S_{ny})$ at site $n$, such that

$$ \hat S_{n-}\begin{pmatrix} 1 \\ 0 \end{pmatrix}_n = \begin{pmatrix} 0 \\ 1 \end{pmatrix}_n ; \tag{17.21} $$

so $\hat S_{n-}\chi_0$ differs from the ground state $\chi_0$ by having the spin at site $n$ flipped.
The action of $\hat H_S$ on $\chi_n$ can be found by writing

$$ \sum_{i\neq j}\hat{\boldsymbol S}_i\cdot\hat{\boldsymbol S}_j = \sum_{i\neq j}\left\{ \tfrac{1}{2}(\hat S_{i-}\hat S_{j+} + \hat S_{j-}\hat S_{i+}) + \hat S_{iz}\hat S_{jz} \right\} \tag{17.22} $$

(remembering that spins on different sites commute), where $\hat S_{i+} = \hat S_{ix} +
i\hat S_{iy}$. Since all $\hat S_{i+}$ operators give zero on a spin ‘up’ state, the only non-zero
contributions from the first (bracketed) term in (17.22) come from terms in
which either $\hat S_{i+}$ or $\hat S_{j+}$ act on the ‘down’ spin at $n$, so as to restore it to ‘up’.
The ‘partner’ operator $\hat S_{i-}$ (or $\hat S_{j-}$) then simply lowers the spin at $i$ (or $j$),
leading to the result

$$ \sum_{i\neq j}\tfrac{1}{2}(\hat S_{i-}\hat S_{j+} + \hat S_{j-}\hat S_{i+})\chi_n = \sum_{i\neq n}\chi_i . \tag{17.23} $$

Thus the state $\chi_n$ is not an eigenstate of $\hat H_S$. However, a little more work
shows that the superpositions

$$ \tilde\chi_q = \frac{1}{\sqrt N}\sum_n {\rm e}^{iqna}\chi_n \tag{17.24} $$

are eigenstates; here $q$ is one of the discretized wavenumbers produced by
appropriate boundary conditions, as is usual in one-dimensional ‘chain’
problems. The states (17.24) represent spin waves, and they have the important
feature that for low $q$ (long wavelength) their frequency $\omega$ tends to zero with
$q$ (actually $\omega \propto q^2$). In this respect, therefore, they behave like massless
particles when quantized, and this is another feature we should expect when a
symmetry is spontaneously broken.
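The spin-wave statement can be tested directly on a small chain. The sketch below makes an assumption that the text does not spell out: the exchange is restricted to nearest neighbours on a periodic chain, $H = -J\sum_n \boldsymbol S_n\cdot\boldsymbol S_{n+1}$, with lattice spacing $a = 1$. Under that assumption the one-magnon energy above the ground state is $J(1 - \cos q) \approx Jq^2/2$. States are stored as dictionaries mapping a bit pattern (bit $n$ set means spin down at site $n$) to an amplitude; all function names are illustrative.

```python
import cmath
import math


def apply_heisenberg(state, N, J=1.0):
    """Apply H = -J * sum_n S_n . S_{n+1} (periodic, spin 1/2) to `state`."""
    out = {}
    for config, amp in state.items():
        for n in range(N):
            m = (n + 1) % N
            down_n = (config >> n) & 1
            down_m = (config >> m) & 1
            # S_nz S_mz: +1/4 for parallel spins, -1/4 for antiparallel spins.
            szsz = 0.25 if down_n == down_m else -0.25
            out[config] = out.get(config, 0) - J * szsz * amp
            # (S_n+ S_m- + S_n- S_m+)/2 exchanges an antiparallel pair.
            if down_n != down_m:
                flipped = config ^ (1 << n) ^ (1 << m)
                out[flipped] = out.get(flipped, 0) - J * 0.5 * amp
    return out


def spin_wave(N, k):
    """Return (q, state) for the plane-wave superposition of one-flip states."""
    q = 2 * math.pi * k / N
    return q, {1 << n: cmath.exp(1j * q * n) / math.sqrt(N) for n in range(N)}


def eigen_residual(N, k, J=1.0):
    """Largest component of (H - E_q) chi_q, with E_q = -J N/4 + J (1 - cos q)."""
    q, chi = spin_wave(N, k)
    E = -J * N / 4 + J * (1 - math.cos(q))
    H_chi = apply_heisenberg(chi, N, J)
    keys = set(H_chi) | set(chi)
    return max(abs(H_chi.get(c, 0) - E * chi.get(c, 0)) for c in keys)


N = 8
single_flip = apply_heisenberg({1 << 3: 1.0}, N)   # spreads onto sites 2, 3 and 4
residuals = [eigen_residual(N, k) for k in range(N)]
print(sorted(single_flip), max(residuals))
```

A single flipped spin spreads onto its neighbours, so it is not an eigenstate, whereas every plane-wave superposition leaves a residual at the level of rounding error.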

The ferromagnet gives us one more useful insight. We have been assuming
that one particular ground state (e.g. the one with $S_z = N/2$) has been
somehow ‘chosen’. But what does the choosing? The answer to this is clear enough
in the (perfectly realistic) case in which the Hamiltonian $\hat H_S$ is supplemented
by a term $-g\mu B\sum_i\hat S_{iz}$, representing the effect of an applied field $B$ directed
along the $z$-axis. This term will indeed ensure that the ground state is unique,
and has $S_z = N/2$. Consider now the two limits $B \to 0$ and $N \to \infty$, both
at finite temperature. When $B \to 0$ at finite $N$, the $N + 1$ different $S_z$
eigenstates become degenerate, and we have an ensemble in which each enters with
an equal weight; there is therefore no loss of symmetry, even as $N \to \infty$ (but
only after $B \to 0$). On the other hand, if $N \to \infty$ at finite $B \neq 0$, the single
state with $S_z = N/2$ will be selected out as the unique ground state and this
asymmetric situation will persist even in the limit $B \to 0$. In a (classical)
mean field theory approximation we suppose that an ‘internal field’ is
‘spontaneously generated’, which is aligned with the external $B$ and survives even
as $B \to 0$, thus ‘spontaneously’ breaking the symmetry.
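The non-commuting limits can be illustrated with a deliberately simplified model: keep only the $(N+1)$ states of the $S = N/2$ multiplet, weight each by $\exp(hM)$ with $M$ the $S_z$ eigenvalue and $h = g\mu B/k_{\rm B}T$ a reduced field, and compute the magnetisation per spin. Restricting to this multiplet, the symbol `h` and the function name are assumptions made for the sketch; it is not a full finite-temperature treatment of the ferromagnet.

```python
import math


def magnetisation(N, h):
    """<S_z>/(N/2) within the S = N/2 multiplet, with weights exp(h * M).

    h = g * mu * B / (k_B * T); M runs over N/2, N/2 - 1, ..., -N/2.
    """
    S = N / 2
    Ms = [S - i for i in range(N + 1)]
    top = max(h * M for M in Ms)              # subtract for numerical stability
    weights = [math.exp(h * M - top) for M in Ms]
    return sum(M * w for M, w in zip(Ms, weights)) / (S * sum(weights))


# N -> infinity at fixed h, followed by decreasing h: stays close to 1.
for h in (0.5, 0.1, 0.02):
    print("h =", h, " N = 10**5:", magnetisation(10 ** 5, h))

# h -> 0 at fixed N: tends to 0, whatever the (finite) value of N.
for N in (10, 100, 1000):
    print("N =", N, " h = 1e-9:", magnetisation(N, 1e-9))
```

Taking $N$ large first leaves the magnetisation per spin close to its maximum for any fixed non-zero field, while taking the field to zero first gives zero for every finite $N$.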

The ferromagnet therefore provides an easily pictured system exhibiting
many of the features associated with spontaneous symmetry breaking; most
importantly, it strongly suggests that what is really characteristic about the
phenomenon is that it entails ‘spontaneous ordering’.³ Generally such ordering
occurs below some characteristic ‘critical temperature’, $T_{\rm C}$. The field which
develops a non-zero equilibrium value below $T_{\rm C}$ is called an ‘order
parameter’. This concept forms the basis of Landau's theory of second-order phase
transitions (see for example chapter XIV of Landau and Lifshitz 1980).

We now turn to an example much more closely analogous to the particle 
physics applications: the superfluid. 


17.3.2 The Bogoliubov superfluid 
Consider the non-relativistic Hamiltonian (in the Schrödinger picture)

$$ \hat H = \frac{1}{2m}\int {\rm d}^3x\,\nabla\hat\phi^\dagger\cdot\nabla\hat\phi + \frac{1}{2}\int\!\!\int {\rm d}^3x\,{\rm d}^3y\; v(|\boldsymbol x - \boldsymbol y|)\,\hat\phi^\dagger(\boldsymbol x)\hat\phi^\dagger(\boldsymbol y)\hat\phi(\boldsymbol y)\hat\phi(\boldsymbol x) \tag{17.25} $$

where $\hat\phi^\dagger(\boldsymbol x)$ creates a boson of mass $m$ at position $\boldsymbol x$. This $\hat H$ describes
identical bosons interacting via a potential $v$, which is assumed to be weak
(see, for example, Schiff 1968 section 55, or Parry 1973 chapter 1). We note
at once that $\hat H$ is invariant under the global U(1) symmetry

$$ \hat\phi(\boldsymbol x) \to \hat\phi'(\boldsymbol x) = {\rm e}^{-i\alpha}\hat\phi(\boldsymbol x), \tag{17.26} $$

the generator being the conserved number operator

$$ \hat N = \int \hat\phi^\dagger\hat\phi\,{\rm d}^3x \tag{17.27} $$

which obeys $[\hat N, \hat H] = 0$. Our ultimate concern will be with the way this
symmetry is ‘spontaneously broken’ in the superfluid ground state. Naturally,
since this is an Abelian, rather than a non-Abelian, symmetry the physics will
not involve any (hidden) multiplet structure. But the nature of the ‘symmetry
breaking ground state’ in this U(1) case (and in the BCS model of section 17.7)
will serve as a physical model for non-Abelian cases also.

3 It is worth pausing to reflect on the idea that ordering is associated with symmetry
breaking.

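We begin by re-writing $\hat H$ in terms of mode creation and annihilation
operators in the usual way. We expand $\hat\phi(\boldsymbol x)$ as a superposition of solutions of
the $v = 0$ problem, which are plane waves quantized in a large cube of volume
$\Omega$:

$$ \hat\phi(\boldsymbol x) = \frac{1}{\Omega^{1/2}}\sum_{\boldsymbol k}\hat a_{\boldsymbol k}\,{\rm e}^{i\boldsymbol k\cdot\boldsymbol x} \tag{17.28} $$

where $\hat a_{\boldsymbol k}|0\rangle = 0$, $\hat a_{\boldsymbol k}^\dagger|0\rangle$ is a one-particle state, and $[\hat a_{\boldsymbol k}, \hat a_{\boldsymbol k'}^\dagger] = \delta_{\boldsymbol k,\boldsymbol k'}$ with
all other commutators vanishing. We impose periodic boundary conditions
at the cube faces, and the free particle energies are $\epsilon_k = \boldsymbol k^2/2m$. Inserting
(17.28) into (17.25) leads (problem 17.1) to

$$ \hat H = \sum_{\boldsymbol k}\epsilon_k\hat a_{\boldsymbol k}^\dagger\hat a_{\boldsymbol k} + \frac{1}{2\Omega}\sum_{\Delta}\bar v(|\boldsymbol k_1 - \boldsymbol k_1'|)\,\hat a_{\boldsymbol k_1}^\dagger\hat a_{\boldsymbol k_2}^\dagger\hat a_{\boldsymbol k_2'}\hat a_{\boldsymbol k_1'}\,\Delta(\boldsymbol k_1 + \boldsymbol k_2 - \boldsymbol k_1' - \boldsymbol k_2') \tag{17.29} $$

where the sum is over all momenta $\boldsymbol k_1, \boldsymbol k_2, \boldsymbol k_1', \boldsymbol k_2'$ subject to the conservation
law imposed by the $\Delta$ function:

$$ \begin{aligned}
\Delta(\boldsymbol k) &= 1 \quad\text{if } \boldsymbol k = 0 && (17.30) \\
&= 0 \quad\text{if } \boldsymbol k \neq 0. && (17.31)
\end{aligned} $$

The interaction term in (17.29) is easily visualized as in figure 17.1. A pair
of particles in states $\boldsymbol k_1', \boldsymbol k_2'$ is scattered (conserving momentum) to a pair in
states $\boldsymbol k_1, \boldsymbol k_2$ via the Fourier transform of $v$:

$$ \bar v(|\boldsymbol k|) = \int v(r)\,{\rm e}^{-i\boldsymbol k\cdot\boldsymbol r}\,{\rm d}^3r. \tag{17.32} $$

Now, below the superfluid transition temperature $T_{\rm S}$, we know that in the
limit as $v \to 0$ the ground state has all the particles ‘condensed’ into the lowest
energy state, which has $\boldsymbol k = 0$. Thus the ground state will be proportional to

$$ |N, 0\rangle = (\hat a_0^\dagger)^N|0\rangle. \tag{17.33} $$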
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FIGURE 17.1 
The interaction term in (17.29). 


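When a weak repulsive $v$ is included, it is reasonable to hope that most of
the particles remain in the condensate, only relatively few being excited to
states with $\boldsymbol k \neq 0$. Let $N_0$ be the number of particles with $\boldsymbol k = 0$, where by
assumption $N_0 \approx N$. We now consider the limit $N$ (and $N_0$) $\to \infty$ and $\Omega \to \infty$
such that the density $\rho = N/\Omega$ (and $\rho_0 = N_0/\Omega$) stays constant. Bogoliubov
(1947) argued that in this limit we may effectively replace both $\hat a_0$ and $\hat a_0^\dagger$ in
the second term in (17.29) by the number $N_0^{1/2}$. This amounts to saying that
in the commutator

$$ \frac{\hat a_0}{\Omega^{1/2}}\frac{\hat a_0^\dagger}{\Omega^{1/2}} - \frac{\hat a_0^\dagger}{\Omega^{1/2}}\frac{\hat a_0}{\Omega^{1/2}} = \frac{1}{\Omega} \tag{17.34} $$

the two terms on the left-hand side are each of order $N_0/\Omega$ and hence finite,
while their difference may be neglected as $\Omega \to \infty$. Replacing $\hat a_0$ and $\hat a_0^\dagger$ by
$N_0^{1/2}$ leads (problem 17.2) to the following approximate form for $\hat H$:

$$ \hat H \approx \hat H_{\rm B} \equiv {\sum_{\boldsymbol k}}'\,\hat a_{\boldsymbol k}^\dagger\hat a_{\boldsymbol k}E_k + \frac{1}{2}\frac{N^2}{\Omega}\bar v(0) + \frac{1}{2}{\sum_{\boldsymbol k}}'\,\frac{N}{\Omega}\bar v(|\boldsymbol k|)\,[\hat a_{\boldsymbol k}^\dagger\hat a_{-\boldsymbol k}^\dagger + \hat a_{\boldsymbol k}\hat a_{-\boldsymbol k}] \tag{17.35} $$

where

$$ E_k = \epsilon_k + \frac{N}{\Omega}\bar v(|\boldsymbol k|), \tag{17.36} $$

primed summations do not include $\boldsymbol k = 0$, and terms which tend to zero as
$\Omega \to \infty$ have been dropped (thus, $N_0$ has been replaced by $N$).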

The most immediately striking feature of (17.35), as compared with $\hat H$ of
(17.29), is that $\hat H_{\rm B}$ does not conserve the U(1) (number) symmetry (17.26)
while $\hat H$ does: it is easy to see that for (17.26) to be a good symmetry, the
number of $\hat a$'s must equal the number of $\hat a^\dagger$'s in every term. Thus the ground
state of $\hat H_{\rm B}$, $|{\rm ground}\rangle_{\rm B}$, cannot be expected to be an eigenstate of the number
operator. However, it is important to be clear that the number non-conserving
aspect of (17.35) is of a completely different kind, conceptually, from that
which would be associated with a (hypothetical) ‘explicit’ number violating
term in the original Hamiltonian, for example the addition of a term of the
form ‘$\hat a^\dagger\hat a\hat a$’. In arriving at (17.35), we effectively replaced (17.28) by

$$ \hat\phi_{\rm B}(\boldsymbol x) = \rho_0^{1/2} + \frac{1}{\Omega^{1/2}}\sum_{\boldsymbol k\neq 0}\hat a_{\boldsymbol k}\,{\rm e}^{i\boldsymbol k\cdot\boldsymbol x} \tag{17.37} $$

where $\rho_0 = N_0/\Omega$, $N_0 \approx N$, and $N_0/\Omega$ remains finite as $\Omega \to \infty$. The limit is
crucial here: it enables us to picture the condensate $N_0$ as providing an infinite
reservoir of particles, with which excitations away from the ground state can
exchange particle number. From this point of view, a number non-conserving
ground state may appear more reasonable. The ultimate test, of course, is
whether such a state is a good approximation to the true ground state, for a
large but finite system.

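What is $|{\rm ground}\rangle_{\rm B}$? Remarkably, $\hat H_{\rm B}$ can be exactly diagonalized by means
of the Bogoliubov quasiparticle operators (for $\boldsymbol k \neq 0$)

$$ \hat\alpha_{\boldsymbol k} = f_k\hat a_{\boldsymbol k} + g_k\hat a_{-\boldsymbol k}^\dagger, \qquad \hat\alpha_{\boldsymbol k}^\dagger = f_k\hat a_{\boldsymbol k}^\dagger + g_k\hat a_{-\boldsymbol k} \tag{17.38} $$

where $f_k$ and $g_k$ are real functions of $k = |\boldsymbol k|$. We must again at once draw
attention to the fact that this transformation does not respect the symmetry
(17.26) either, since $\hat a_{\boldsymbol k} \to {\rm e}^{-i\alpha}\hat a_{\boldsymbol k}$ while $\hat a_{-\boldsymbol k}^\dagger \to {\rm e}^{+i\alpha}\hat a_{-\boldsymbol k}^\dagger$. In fact, the
operators $\hat\alpha_{\boldsymbol k}^\dagger$ will turn out to be precisely creation operators for quasiparticles
which exchange particle number with the ground state.

The commutator of $\hat\alpha_{\boldsymbol k}$ and $\hat\alpha_{\boldsymbol k}^\dagger$ is easily evaluated:

$$ [\hat\alpha_{\boldsymbol k}, \hat\alpha_{\boldsymbol k}^\dagger] = f_k^2 - g_k^2, \tag{17.39} $$

while two $\hat\alpha$'s or two $\hat\alpha^\dagger$'s commute. We choose $f_k$ and $g_k$ such that $f_k^2 - g_k^2 = 1$,
so that the $\hat a$'s and the $\hat\alpha$'s have the same (bosonic) commutation relations,
and the transformation (17.38) is ‘canonical’. A convenient choice is $f_k =
\cosh\theta_k$, $g_k = \sinh\theta_k$. We now assert that $\hat H_{\rm B}$ can be written in the form

$$ \hat H_{\rm B} = {\sum_{\boldsymbol k}}'\,\omega_k\hat\alpha_{\boldsymbol k}^\dagger\hat\alpha_{\boldsymbol k} + \beta \tag{17.40} $$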


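for certain $\omega_k$ and $\beta$. Equation (17.40) implies, of course, that the eigenvalues
of $\hat H_{\rm B}$ are $\beta + \sum_{\boldsymbol k} n_{\boldsymbol k}\omega_k$ with non-negative integers $n_{\boldsymbol k}$, and that $\hat\alpha_{\boldsymbol k}^\dagger$ acts as the
creation operator for the quasiparticle of energy $\omega_k$, as just anticipated.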
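As a quick check of (17.39), and of the statement that two quasiparticle annihilation operators commute, the commutators can be expanded using only $[\hat a_{\boldsymbol k}, \hat a_{\boldsymbol k'}^\dagger] = \delta_{\boldsymbol k,\boldsymbol k'}$ and the fact that $f_k$ and $g_k$ depend only on $|\boldsymbol k|$, so that $\hat\alpha_{-\boldsymbol k} = f_k\hat a_{-\boldsymbol k} + g_k\hat a_{\boldsymbol k}^\dagger$. This is a sketch of the intermediate steps, not an additional result:

```latex
\begin{align*}
[\hat\alpha_{\boldsymbol k}, \hat\alpha_{\boldsymbol k}^\dagger]
  &= [f_k\hat a_{\boldsymbol k} + g_k\hat a_{-\boldsymbol k}^\dagger,\;
      f_k\hat a_{\boldsymbol k}^\dagger + g_k\hat a_{-\boldsymbol k}] \\
  &= f_k^2\,[\hat a_{\boldsymbol k}, \hat a_{\boldsymbol k}^\dagger]
   + g_k^2\,[\hat a_{-\boldsymbol k}^\dagger, \hat a_{-\boldsymbol k}]
   = f_k^2 - g_k^2 , \\
[\hat\alpha_{\boldsymbol k}, \hat\alpha_{-\boldsymbol k}]
  &= f_k g_k\left([\hat a_{\boldsymbol k}, \hat a_{\boldsymbol k}^\dagger]
   + [\hat a_{-\boldsymbol k}^\dagger, \hat a_{-\boldsymbol k}]\right)
   = f_k g_k\,(1 - 1) = 0 .
\end{align*}
```

The second line shows why the pairing of $\boldsymbol k$ with $-\boldsymbol k$ in (17.38) does not spoil the bosonic algebra: the two potentially dangerous cross terms cancel.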

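We verify (17.40) slightly indirectly. We note first that it implies that

$$ [\hat H_{\rm B}, \hat\alpha_{\boldsymbol l}^\dagger] = \omega_l\hat\alpha_{\boldsymbol l}^\dagger. \tag{17.41} $$

Substituting for $\hat\alpha_{\boldsymbol l}^\dagger$ from (17.38), we require

$$ [\hat H_{\rm B}, \cosh\theta_l\,\hat a_{\boldsymbol l}^\dagger + \sinh\theta_l\,\hat a_{-\boldsymbol l}] = \omega_l(\cosh\theta_l\,\hat a_{\boldsymbol l}^\dagger + \sinh\theta_l\,\hat a_{-\boldsymbol l}), \tag{17.42} $$

which must hold as an identity in the $\hat a$'s and $\hat a^\dagger$'s. Using the expression (17.35)
for $\hat H_{\rm B}$, and some patient work with the commutation relations (problem 17.3),
one finds

$$ (\omega_l - E_l)\cosh\theta_l + \frac{N}{\Omega}\bar v(|\boldsymbol l|)\sinh\theta_l = 0 \tag{17.43} $$

$$ \frac{N}{\Omega}\bar v(|\boldsymbol l|)\cosh\theta_l - (\omega_l + E_l)\sinh\theta_l = 0. \tag{17.44} $$

For consistency, therefore, we require

$$ E_l^2 - \omega_l^2 - \frac{N^2}{\Omega^2}(\bar v(|\boldsymbol l|))^2 = 0, \tag{17.45} $$

or (recalling the definitions of $E_l$ and $\epsilon_l$)

$$ \omega_l = \left[\frac{\boldsymbol l^2}{2m}\left(\frac{\boldsymbol l^2}{2m} + 2\rho\bar v(|\boldsymbol l|)\right)\right]^{1/2} \tag{17.46} $$

where $\rho = N/\Omega$. The value of $\tanh\theta_l$ is then determined via either of (17.43),
(17.44).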
Equation (17.46) is an important result, giving the frequency as a function
of the momentum (or wavenumber); it is an example of a ‘dispersion relation’.
At the risk of stating the obvious, let us emphasize that equation (17.40) tells
us that the original system of interacting bosons is equivalent (under the
approximations made) to a system of non-interacting quasiparticles, whose
frequency $\omega_l$ is related to wavenumber by (17.46). These are the true modes
of the system. Let us consider this dispersion relation.

First of all, in the non-interacting case $\bar v \to 0$, we recover the usual
frequency-wavenumber relation for a massive non-relativistic particle, $\omega_l =
\boldsymbol l^2/2m$. But if $\bar v(0) \neq 0$, the behaviour at small $\boldsymbol l$ is very different: $\omega_l \approx c_{\rm s}|\boldsymbol l|$,
where $c_{\rm s} = (\rho\bar v(0)/m)^{1/2}$. This dispersion relation is characteristic of a
massless mode, but in this case it is sound rather than light, with speed of sound
$c_{\rm s}$. The spectrum is therefore phonon-like, not (non-relativistic) particle-like.
The two behaviours can be easily distinguished experimentally, by measuring
the low-temperature specific heat: in three dimensions, for $\omega_l \sim \boldsymbol l^2$ it goes
to zero as $T^{3/2}$, whereas for $\omega_l \sim |\boldsymbol l|$ it goes as $T^3$. The latter behaviour
is observed in superfluids. At large values of $|\boldsymbol l|$, however, $\omega_l$ behaves
essentially like $\boldsymbol l^2/2m$ and the spectrum returns to the ‘particle-like’ one of massive
bosons. Thus (17.46) interpolates between phonon-like behaviour at small $|\boldsymbol l|$
and particle-like behaviour at large $|\boldsymbol l|$.

There is still more to be learned from (17.46). If, in fact, $\bar v(|\boldsymbol l|) \sim 1/\boldsymbol l^2$,
then $\omega_l \to$ constant as $|\boldsymbol l| \to 0$, and the spectrum would not be phonon-like.
Indeed, if $\bar v(|\boldsymbol l|) \sim e^2/\boldsymbol l^2$, then $\omega_l \sim |e|(\rho/m)^{1/2}$ for small $|\boldsymbol l|$, which is just the
‘plasma frequency’ $\omega_{\rm P}$. In particle physics terms, this would be analogous to
a dispersion relation of the form $\omega_l \sim (\omega_{\rm P}^2 + \boldsymbol l^2)^{1/2}$ which describes a particle
with mass $\omega_{\rm P}$. Such a $\bar v$ is, of course, Coulombic (the Fourier transform of
$e^2/|\boldsymbol x|$), indicating that in the case of such a long-range force the frequency
spectrum acquires a mass-gap. This will be the topic of chapter 19.
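The three regimes of (17.46) discussed above are easy to confirm numerically. The sketch below evaluates $\omega_l = [\epsilon_l(\epsilon_l + 2\rho\bar v(|\boldsymbol l|))]^{1/2}$ for two model choices of $\bar v$: a constant (short-range) one and a Coulomb-like one proportional to $e^2/\boldsymbol l^2$. The units ($m = \rho = 1$), the values of $\bar v(0)$ and $e^2$, and the function names are illustrative assumptions.

```python
import math


def omega(l, m, rho, vbar):
    """Bogoliubov frequency (17.46); `vbar` is a function of |l|."""
    eps = l * l / (2.0 * m)
    return math.sqrt(eps * (eps + 2.0 * rho * vbar(abs(l))))


m, rho = 1.0, 1.0                      # illustrative units


def contact(l):
    """Short-range potential: vbar(|l|) taken constant, equal to vbar(0)."""
    return 0.5


e2 = 2.0


def coulomb(l):
    """Long-range potential: vbar(|l|) = e^2 / l^2 (numerical factors dropped)."""
    return e2 / (l * l)


c_s = math.sqrt(rho * contact(0.0) / m)     # speed of sound
omega_p = math.sqrt(e2 * rho / m)           # plasma frequency |e| (rho/m)^(1/2)

print("small l, contact :", omega(1e-3, m, rho, contact), "vs c_s*l   =", c_s * 1e-3)
print("large l, contact :", omega(1e3, m, rho, contact), "vs l^2/2m  =", 1e6 / (2 * m))
print("small l, Coulomb :", omega(1e-3, m, rho, coulomb), "vs omega_P =", omega_p)
```

The contact potential gives a gapless, linear spectrum at small $|\boldsymbol l|$, while the Coulomb-like potential gives a finite frequency as $|\boldsymbol l| \to 0$.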

Having discussed the spectrum of quasiparticle excitations, let us now
concentrate on the ground state in this model. From (17.40), it is clear that
it is defined as the state $|{\rm ground}\rangle_{\rm B}$ such that

$$ \hat\alpha_{\boldsymbol k}|{\rm ground}\rangle_{\rm B} = 0 \quad\text{for all } \boldsymbol k \neq 0; \tag{17.47} $$

i.e. as the state with no non-zero-momentum quasiparticles in it. This is a
complicated state in terms of the original $\hat a_{\boldsymbol k}$ and $\hat a_{\boldsymbol k}^\dagger$ operators, but we can
give a formal expression for it, as follows. Since the $\hat\alpha$'s and $\hat a$'s are related by
a canonical transformation, there must exist a unitary operator $\hat U_{\rm B}$ such that

$$ \hat\alpha_{\boldsymbol k} = \hat U_{\rm B}\hat a_{\boldsymbol k}\hat U_{\rm B}^{-1}, \qquad \hat\alpha_{\boldsymbol k}^\dagger = \hat U_{\rm B}\hat a_{\boldsymbol k}^\dagger\hat U_{\rm B}^{-1}. \tag{17.48} $$

Now we know that $\hat a_{\boldsymbol k}|0\rangle = 0$. Hence it follows that

$$ \hat\alpha_{\boldsymbol k}\hat U_{\rm B}|0\rangle = 0, \tag{17.49} $$

and we can identify $|{\rm ground}\rangle_{\rm B}$ with $\hat U_{\rm B}|0\rangle$. In problem 17.4, $\hat U_{\rm B}$ is evaluated for
an $\hat H_{\rm B}$ consisting of a single $k$-mode only, in which case the operator effecting
the transformation analogous to (17.48) is $\hat U_1 = \exp[\theta(\hat a\hat a - \hat a^\dagger\hat a^\dagger)/2]$ where
$\theta$ replaces $\theta_k$ in this case. This generalizes (in the form of products of such
operators) to the full $\hat H_{\rm B}$ case, but we shall not need the detailed result; an
analogous result for the BCS ground state is discussed more fully in section
17.7. The important point is the following. It is clear from expanding the
exponentials that $\hat U_{\rm B}$ creates a state in which the number of $a$-quanta (i.e. the
original bosons) is not fixed. Thus unlike the simple non-interacting ground
state $|N, 0\rangle$ of (17.33), $|{\rm ground}\rangle_{\rm B} = \hat U_{\rm B}|0\rangle$ does not have a fixed number of
particles in it: that is to say, it is not an eigenstate of the symmetry operator
$\hat N$, as anticipated in the comment following (17.36). This is just the situation
alluded to in the paragraph before equation (17.19), in our discussion of the
ferromagnet.
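The statement that the Bogoliubov ground state has no definite number of $a$-quanta can be made concrete in the single-mode case. The sketch below assumes only the single-mode analogue of (17.47), $(\cosh\theta\,\hat a + \sinh\theta\,\hat a^\dagger)|\psi\rangle = 0$, expands $|\psi\rangle = \sum_m c_m|m\rangle$ in number states, and solves the resulting recursion $c_{m+1} = -\tanh\theta\,\sqrt{m/(m+1)}\;c_{m-1}$ with $c_1 = 0$. The function name, the cut-off `m_max` and the value of $\theta$ are illustrative choices.

```python
import math


def ground_state_amplitudes(theta, m_max):
    """Number-state amplitudes c_0..c_{m_max} of the state annihilated by
    alpha = cosh(theta) a + sinh(theta) a^dagger, normalised to unity."""
    c = [0.0] * (m_max + 1)
    c[0] = 1.0
    # The |0> component of alpha|psi> = 0 forces c[1] = 0, so only even m survive.
    for m in range(1, m_max):
        c[m + 1] = -math.tanh(theta) * math.sqrt(m / (m + 1)) * c[m - 1]
    norm = math.sqrt(sum(x * x for x in c))
    return [x / norm for x in c]


theta = 0.8
c = ground_state_amplitudes(theta, 800)
probs = [x * x for x in c]
mean_n = sum(m * p for m, p in enumerate(probs))
var_n = sum(m * m * p for m, p in enumerate(probs)) - mean_n ** 2

print("P(0), P(2), P(4):", probs[0], probs[2], probs[4])
print("mean number:", mean_n, " sinh^2(theta):", math.sinh(theta) ** 2)
print("variance   :", var_n)
```

The state is a superposition of all even particle numbers, with mean $\sinh^2\theta$ and a non-zero variance, so it cannot be an eigenstate of the number operator unless $\theta = 0$.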

Consider now the expectation value of $\hat\phi(\mathbf{x})$ in any state of definite particle
number - that is, in an eigenstate of the symmetry operator $\hat N$. It is easy to
see that this must vanish (remember that $\hat\phi$ destroys a boson, and so $\hat\phi|N\rangle$ is
proportional to $|N-1\rangle$, which is orthogonal to $|N\rangle$). On the other hand, this
is not true of $\hat\phi_B(\mathbf{x})$: for example, in the non-interacting ground state (17.33),
we have

$$ \langle N, 0|\hat\phi_B(\mathbf{x})|N, 0\rangle = \rho_0^{1/2}. \tag{17.50} $$

Furthermore, using the inverse of (17.38)

$$ \hat a_k = \cosh\theta_k\,\hat\alpha_k - \sinh\theta_k\,\hat\alpha_{-k}^\dagger \tag{17.51} $$

together with (17.47), we find the similar result:

$$ {}_B\langle\mathrm{ground}|\hat\phi_B(\mathbf{x})|\mathrm{ground}\rangle_B = \rho_0^{1/2}. \tag{17.52} $$


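The question is now how to generalize (17.50) or (17.52) to the complete $\hat\phi(\mathbf{x})$
and the true ground state $|\mathrm{ground}\rangle$, in the limit $N, \Omega \to \infty$ with fixed $N/\Omega$.
We make the assumption that

$$ \langle\mathrm{ground}|\hat\phi(\mathbf{x})|\mathrm{ground}\rangle \neq 0; \tag{17.53} $$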


that is, we abstract from the Bogoliubov model the crucial feature that the 
field acquires a non-zero expectation value in the ground state, in the infinite 
volume limit. 

We are now at the heart of spontaneous symmetry breaking in field theory.
Condition (17.53) has the form of an ‘ordering’ condition: it is analogous to
the non-zero value of the total spin in the ferromagnetic case, but in (17.53)
- we must again emphasize - $|\mathrm{ground}\rangle$ is not an eigenstate of the symmetry
operator $\hat N$; if it were, (17.53) would vanish, as we have just seen. Recalling
the association ‘quantum vacuum $\leftrightarrow$ many body ground state’ we expect
that the occurrence of a non-zero vacuum expectation value (vev) for an
operator transforming non-trivially under a symmetry operator will be the key
requirement for spontaneous symmetry breaking in field theory. Such operators
are generically called order parameters. In the next section we show how
this requirement necessitates one (or more) massless modes, via Goldstone’s
theorem (1961).

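Before leaving the superfluid, we examine (17.37) and (17.52) in another
way, which is only rigorous for a finite system but is nevertheless very
suggestive. Since the original $\hat H$ has a U(1) symmetry under which $\hat\phi$ transforms to
$\hat\phi' = \exp(-i\alpha)\hat\phi$, we should be at liberty to replace (17.37) by

$$ \hat\phi'_B = e^{-i\alpha}\rho_0^{1/2} + \frac{1}{\Omega^{1/2}}\sum_{k \neq 0}\hat a_k\,e^{-i\alpha}\,e^{i\mathbf{k}\cdot\mathbf{x}}. \tag{17.54} $$

But in that case our condition (17.52) becomes

$$ {}_B\langle\mathrm{ground}|\hat\phi'_B|\mathrm{ground}\rangle_B = e^{-i\alpha}\,{}_B\langle\mathrm{ground}|\hat\phi_B|\mathrm{ground}\rangle_B. \tag{17.55} $$

Now $\hat\phi' = \hat U_\alpha\hat\phi\hat U_\alpha^{-1}$ where $\hat U_\alpha = \exp(i\alpha\hat N)$. Hence (17.55) may be written as

$$ {}_B\langle\mathrm{ground}|\hat U_\alpha\hat\phi_B\hat U_\alpha^{-1}|\mathrm{ground}\rangle_B = e^{-i\alpha}\,{}_B\langle\mathrm{ground}|\hat\phi_B|\mathrm{ground}\rangle_B. \tag{17.56} $$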


If $|\mathrm{ground}\rangle_B$ were an eigenstate of $\hat N$ with eigenvalue $N$, say, then the $\hat U_\alpha$
factors in (17.56) would become just $e^{i\alpha N}\cdot e^{-i\alpha N}$ and would cancel out, leaving a
contradiction. Instead, however, knowing that $|\mathrm{ground}\rangle_B$ is not an eigenstate
of $\hat N$, we can regard $\hat U_\alpha^{-1}|\mathrm{ground}\rangle_B$ as an ‘alternative ground state’
$|\mathrm{ground}, \alpha\rangle_B$ such that

$$ {}_B\langle\mathrm{ground}, \alpha|\hat\phi_B|\mathrm{ground}, \alpha\rangle_B = e^{-i\alpha}\,{}_B\langle\mathrm{ground}|\hat\phi_B|\mathrm{ground}\rangle_B, \tag{17.57} $$


the original choice (17.52) corresponding to $\alpha = 0$. There are infinitely many
such ground states since $\alpha$ is a continuous parameter. No physical consequence
follows from choosing one rather than another, but we do have to choose one,
thus ‘spontaneously’ breaking the symmetry. In choosing say $\alpha = 0$, we are
deciding (arbitrarily) to pick the ground state such that ${}_B\langle\mathrm{ground}|\hat\phi_B|\mathrm{ground}\rangle_B$
is aligned in the ‘real’ direction. By hypothesis, a similar situation obtains
for the true ground state. None of the states $|\mathrm{ground}, \alpha\rangle$ is an eigenstate of
$\hat N$: instead, they are certain coherent superpositions of states with different
eigenvalues $N$, such that the expectation value of $\hat\phi$ has a definite phase.
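
The role of the phase in (17.57) can be illustrated with a single bosonic mode. The sketch below is a toy model, not the Bogoliubov ground state itself: as an assumption, a truncated coherent state stands in for a ground state that is a superposition of different $N$, and $\hat U_\alpha^{-1} = \exp(-i\alpha\hat N)$ is applied to it. All names and numerical values are hypothetical.

```python
import cmath
import math


def coherent_amplitudes(z, n_max=60):
    """Number-basis amplitudes of a coherent state, truncated at n_max quanta."""
    amps = [cmath.exp(-abs(z) ** 2 / 2)]
    for n in range(1, n_max + 1):
        amps.append(amps[-1] * z / math.sqrt(n))
    return amps


def rotate(amps, alpha):
    """Apply U_alpha^{-1} = exp(-i alpha N): |n> acquires exp(-i alpha n)."""
    return [c * cmath.exp(-1j * alpha * n) for n, c in enumerate(amps)]


def expect_a(amps):
    """<psi| a |psi>, using a|n+1> = sqrt(n+1)|n>."""
    return sum(amps[n].conjugate() * amps[n + 1] * math.sqrt(n + 1)
               for n in range(len(amps) - 1))


ground = coherent_amplitudes(1.5)     # plays the role of alpha = 0
ground_alpha = rotate(ground, 0.7)    # 'alternative ground state'
number_state = [0j] * 5 + [1 + 0j]    # |N = 5>, a state of definite N

print(expect_a(ground))        # real: 'aligned in the real direction'
print(expect_a(ground_alpha))  # same modulus, phase rotated by exp(-0.7i)
print(expect_a(number_state))  # vanishes for a number eigenstate
```

The rotated state has the same modulus of $\langle\hat a\rangle$ but a different phase, while a number eigenstate gives zero, in line with the argument following (17.56).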




17.4 Goldstone’s theorem 


We return to quantum field theory proper, and show following Goldstone 
(1961) (see also Goldstone, Salam and Weinberg 1962) how in case (b) of the 
Fabri-Picasso theorem massless particles will necessarily be present. Whether 
these particles will actually be observable depends, however, on whether the 
theory also contains gauge fields. In this chapter we are concerned solely with 
global symmetries, and gauge fields are absent; the local symmetry case is 
treated in chapter 19. 

Suppose, then, that we have a Lagrangian $\hat{\mathcal L}$ with a continuous symmetry
generated by a charge $\hat Q$, which is independent of time, and is the space
integral of the $\mu = 0$ component of a conserved Noether current:

$$ \hat Q = \int \hat j_0(x)\,d^3x. \tag{17.58} $$

We consider the case in which the vacuum of this theory is not invariant, i.e.
is not annihilated by $\hat Q$.

Suppose $\hat\phi(y)$ is some field operator which is not invariant under the
continuous symmetry in question, and consider the vacuum expectation value

$$ \langle 0|[\hat Q, \hat\phi(y)]|0\rangle. \tag{17.59} $$

Just as in equation (17.13), translation invariance implies that this vev is, in
fact, independent of $y$, and we may set $y = 0$. If $\hat Q$ were to annihilate $|0\rangle$, the
expression (17.18) would clearly vanish: we investigate the consequences of it
not vanishing. Since $\hat\phi$ is not invariant under $\hat Q$, the commutator in (17.59) will
give some other field, call it $\hat\phi'(y)$; thus the hallmark of the hidden symmetry
situation is the existence of some field (here $\hat\phi'(y)$) with non-vanishing vacuum
expectation value, just as in (17.53).
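
From (17.58), we can write (17.59) as

$$ 0 \neq \langle 0|\hat\phi'(y)|0\rangle \tag{17.60} $$
$$ = \langle 0|\Big[\int d^3x\,\hat j_0(x),\ \hat\phi(y)\Big]|0\rangle. \tag{17.61} $$

Since, by assumption, $\partial_\mu\hat j^\mu = 0$, we have as usual

$$ \frac{\partial}{\partial x_0}\int d^3x\,\hat j_0(x) + \int d^3x\,\nabla\cdot\hat{\mathbf{j}}(x) = 0, \tag{17.62} $$

whence

$$ \frac{\partial}{\partial x_0}\int d^3x\,\langle 0|[\hat j_0(x), \hat\phi(y)]|0\rangle = -\int d^3x\,\langle 0|[\nabla\cdot\hat{\mathbf{j}}(x), \hat\phi(y)]|0\rangle \tag{17.63} $$
$$ = -\int d\mathbf{S}\cdot\langle 0|[\hat{\mathbf{j}}(x), \hat\phi(y)]|0\rangle. \tag{17.64} $$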


If the surface integral vanishes in (17.64), (17.61) will be independent of $x_0$.
The commutator in (17.64) involves local operators separated by a very large
space-like interval, and therefore the vanishing of (17.64) would seem to be
unproblematic. Indeed so it is, with the exception of the case in which the
symmetry is local and gauge fields are present. A detailed analysis of exactly
how this changes the argument being presented here will take us too far afield
at this point, and the reader is referred to Guralnik et al. (1968) and Bernstein
(1974). We shall treat the ‘spontaneously broken’ gauge theory case in chapter
19, but in less formal terms.

Let us now see how the independence of (17.61) on $x_0$ leads to the necessity
for a massless particle in the spectrum. Inserting a complete set of states in
(17.61), we obtain

$$ 0 \neq \int d^3x \sum_n \big\{ \langle 0|\hat j_0(x)|n\rangle\langle n|\hat\phi(y)|0\rangle - \langle 0|\hat\phi(y)|n\rangle\langle n|\hat j_0(x)|0\rangle \big\} \tag{17.65} $$
$$ = \int d^3x \sum_n \big\{ \langle 0|\hat j_0(0)|n\rangle\langle n|\hat\phi(y)|0\rangle e^{-ip_n\cdot x} - \langle 0|\hat\phi(y)|n\rangle\langle n|\hat j_0(0)|0\rangle e^{ip_n\cdot x} \big\} \tag{17.66} $$

using translation invariance, with $p_n$ the 4-momentum eigenvalue of the state
$|n\rangle$. Performing the spatial integral on the right-hand side we find (omitting
the irrelevant $(2\pi)^3$)

$$ 0 \neq \sum_n \delta^3(\mathbf{p}_n)\big[ \langle 0|\hat j_0(0)|n\rangle\langle n|\hat\phi(y)|0\rangle e^{-ip_{n0}x_0} - \langle 0|\hat\phi(y)|n\rangle\langle n|\hat j_0(0)|0\rangle e^{ip_{n0}x_0} \big]. \tag{17.67} $$

But this expression is independent of $x_0$. Massive states $|n\rangle$ will produce
explicit $x_0$-dependent factors $e^{\pm iM_n x_0}$ ($p_{n0} \to M_n$ as the $\delta$-function constrains
$\mathbf{p}_n = 0$), hence the matrix elements of $\hat j_0$ between $|0\rangle$ and such a massive state
must vanish, and such states contribute zero to (17.67). Equally, if we take
$|n\rangle = |0\rangle$, (17.67) vanishes identically. But it has been assumed to be not zero.
Hence some state or states must exist among $|n\rangle$ such that $\langle 0|\hat j_0|n\rangle \neq 0$ and
yet (17.67) is independent of $x_0$. The only possibility is states whose energy
$p_{n0}$ goes to zero as their 3-momentum does (from $\delta^3(\mathbf{p}_n)$). Such states are,
of course, massless; they are called generically Goldstone modes. Thus the
existence of a non-vanishing vacuum expectation value for a field, in a theory
with a continuous symmetry, appears to lead inevitably to the necessity of
having a massless particle, or particles, in the theory. This is the Goldstone
result.

The superfluid provided us with an explicit model exhibiting the crucial
non-zero expectation value $\langle\mathrm{ground}|\hat\phi|\mathrm{ground}\rangle \neq 0$, in which the now expected
massless mode emerged dynamically. We now discuss a simpler, relativistic
model, in which the symmetry breaking is brought about more ‘by hand’,
that is, by choosing a parameter in the Lagrangian appropriately. Although
in a sense less ‘dynamical’ than the Bogoliubov superfluid (or the BCS
superconductor, to be discussed shortly) this Goldstone model does provide a
very simple example of the phenomenon of spontaneous symmetry breaking
in field theory.


17.5 Spontaneously broken global U(1) symmetry: the 
Goldstone model 


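We consider, following Goldstone (1961), a complex scalar field $\hat\phi$ as in
section 7.1, with

$$ \hat\phi = \frac{1}{\sqrt 2}(\hat\phi_1 - i\hat\phi_2), \qquad \hat\phi^\dagger = \frac{1}{\sqrt 2}(\hat\phi_1 + i\hat\phi_2), \tag{17.68} $$

described by the Lagrangian

$$ \hat{\mathcal L}_G = (\partial_\mu\hat\phi^\dagger)(\partial^\mu\hat\phi) - \hat V(\hat\phi). \tag{17.69} $$

We begin by considering the ‘normal’ case in which the potential has the form

$$ \hat V = \hat V_S = \frac{1}{4}\lambda(\hat\phi^\dagger\hat\phi)^2 + \mu^2\hat\phi^\dagger\hat\phi \tag{17.70} $$

with $\mu^2, \lambda > 0$. The Hamiltonian density is then

$$ \hat{\mathcal H}_G = \dot{\hat\phi}^\dagger\dot{\hat\phi} + \nabla\hat\phi^\dagger\cdot\nabla\hat\phi + \hat V(\hat\phi). \tag{17.71} $$

Clearly $\hat{\mathcal L}_G$ is invariant under the global U(1) symmetry

$$ \hat\phi \to \hat\phi' = e^{-i\alpha}\hat\phi, \tag{17.72} $$

the generator being $\hat N_\phi$ of (7.23). We shall see how this symmetry may be
‘spontaneously broken’.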

We know that everything depends on the nature of the ground state of
this field system, that is, the vacuum of the quantum field theory. In
general, it is a difficult, non-perturbative, problem to find the ground state (or a
good approximation to it: witness the superfluid). But we can make some
progress by first considering the theory classically. It is clear that the absolute
minimum of the classical Hamiltonian $\mathcal H_G$ is reached for

(i) $\phi$ = constant, which reduces the $\dot\phi$ and $\nabla\phi$ terms to zero;

(ii) $\phi = \phi_0$ where $\phi_0$ is the minimum of the classical version of the
potential, $V$.


For $V = V_S$ as in (17.70) but without the hats, and with $\lambda$ and $\mu^2$ both
positive, the minimum of $V_S$ is clearly at $\phi = 0$, and is unique. In the quantum
theory, we expect to treat small oscillations of the field about this minimum as
approximately harmonic, leading to the usual quantized modes. To implement
this, we expand $\hat\phi$ about the classical minimum at $\phi = 0$, writing as usual

$$ \hat\phi = \int \frac{d^3k}{(2\pi)^3\sqrt{2\omega}}\big[\hat a(k)e^{-ik\cdot x} + \hat b^\dagger(k)e^{ik\cdot x}\big] \tag{17.73} $$

where the plane waves are solutions of the ‘free’ ($\lambda = 0$) problem. For $\lambda = 0$
the Lagrangian is simply

$$ \hat{\mathcal L}_{\rm free} = \partial_\mu\hat\phi^\dagger\partial^\mu\hat\phi - \mu^2\hat\phi^\dagger\hat\phi, \tag{17.74} $$

which represents a complex scalar field, consisting of two degrees of freedom,
each with the same mass $\mu$ (see section 7.1). Thus in (17.73) $\omega = (\mathbf{k}^2 + \mu^2)^{1/2}$,
and the vacuum is defined by

$$ \hat a(k)|0\rangle = \hat b(k)|0\rangle = 0, \tag{17.75} $$

and so clearly

$$ \langle 0|\hat\phi|0\rangle = 0. \tag{17.76} $$

It seems reasonable to interpret quantum field average values as corresponding
to classical field values, and on this interpretation (17.76) is consistent with
the fact that the classical minimum energy configuration has $\phi = 0$.

Consider now the case in which the classical minimum is not at $\phi = 0$.
This can be achieved by altering the sign of $\mu^2$ in (17.70) ‘by hand’, so that
the classical potential is now the ‘symmetry breaking’ one

$$ V = V_{SB} = \frac{1}{4}\lambda(\phi^\dagger\phi)^2 - \mu^2\phi^\dagger\phi. \tag{17.77} $$

FIGURE 17.2
The classical potential $V_{SB}$ of (17.77).


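This is sketched versus $\phi_1$ and $\phi_2$ in figure 17.2. This time, although the
origin $\phi_1 = \phi_2 = 0$ is a stationary point, it is an (unstable) maximum rather
than a minimum. The minimum of $V_{SB}$ occurs when

$$ (\phi^\dagger\phi) = \frac{2\mu^2}{\lambda}, \tag{17.78} $$

or alternatively when

$$ \phi_1^2 + \phi_2^2 = \frac{4\mu^2}{\lambda} \equiv v^2 \tag{17.79} $$

where

$$ v = \frac{2|\mu|}{\lambda^{1/2}}. \tag{17.80} $$

The condition (17.79) can also be written as

$$ |\phi| = v/\sqrt 2. \tag{17.81} $$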


To have a clearer picture, it is helpful to introduce the ‘polar’ variables $\rho(x)$
and $\theta(x)$ via

$$ \phi(x) = (\rho(x)/\sqrt 2)\exp(i\theta(x)/v) \tag{17.82} $$

where for convenience the $v$ is inserted so that $\theta$ has the same dimension
(mass) as $\rho$ and $\phi$. The minimum condition (17.81) therefore represents the
circle $\rho = v$; any point on this circle, at any value of $\theta$, represents a possible
classical ground state, and it is clear that they are (infinitely) degenerate.
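
The degeneracy of the classical ground states can be made concrete numerically. The following sketch simply evaluates the potential (17.77), written in terms of the real fields through $\phi^\dagger\phi = (\phi_1^2 + \phi_2^2)/2$, at several angles on the circle $\rho = v$. The parameter values $\mu = 1$, $\lambda = 0.5$ and the function names are arbitrary illustrative choices.

```python
import math


def v_sb(phi1, phi2, mu=1.0, lam=0.5):
    """Classical potential (17.77) with phi^dagger phi = (phi1^2 + phi2^2)/2."""
    mod_sq = 0.5 * (phi1 ** 2 + phi2 ** 2)
    return 0.25 * lam * mod_sq ** 2 - mu ** 2 * mod_sq


def circle_of_minima(mu=1.0, lam=0.5, n_points=8):
    """Values of V_SB at equally spaced angles on the circle rho = v."""
    v = 2 * abs(mu) / math.sqrt(lam)  # radius of the circle, from (17.79)
    return [v_sb(v * math.cos(2 * math.pi * i / n_points),
                 v * math.sin(2 * math.pi * i / n_points), mu, lam)
            for i in range(n_points)]


values = circle_of_minima()
print(values)          # every entry equals -mu^4/lam up to rounding
print(v_sb(0.0, 0.0))  # the origin lies higher: it is not the minimum
```

Every point on the circle returns the same value $-\mu^4/\lambda$, while the origin and points just off the circle lie higher.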

Before proceeding further, we briefly outline a condensed matter analogue
of (17.77) and (17.81) which may help in understanding the change in sign of
the parameter $\mu^2$. Consider the free energy $F$ of a ferromagnet as a function
of the magnetization $\mathbf{M}$ at temperature $T$, and make an expansion of the
form

$$ F \approx F_0(T) + \mu^2(T)\mathbf{M}^2 + \frac{\lambda}{4}\mathbf{M}^4 + \cdots \tag{17.83} $$

valid for weak and slowly varying magnetization. If the parameter $\mu^2$ is
positive, it is clear that $F$ has a simple ‘bowl’ shape as a function of $|\mathbf{M}|$, with a
minimum at $|\mathbf{M}| = 0$. This is the case for $T$ greater than the ferromagnetic
transition temperature $T_C$. However, if one assumes that $\mu^2(T)$ changes sign
at $T_C$, becoming negative for $T < T_C$, then $F$ will now resemble a vertical
section of figure 17.2, the minimum being at $|\mathbf{M}| \neq 0$. Any direction of $\mathbf{M}$
is possible (only $|\mathbf{M}|$ is specified); but the system must choose one particular
direction (e.g. via the influence of a very weak external field, as discussed in
section 17.3.1), and when it does so the rotational invariance exhibited by $F$
of (17.83) is lost. This symmetry has been broken ‘spontaneously’, though
this is still only a classical analogue. Nevertheless, the model is essentially
the Landau mean field theory of ferromagnetism, and suggests that we should
think of the ‘symmetric’ and ‘broken symmetry’ situations as different phases
of the same system. It may also be the case in particle physics that parameters
such as $\mu^2$ change sign as a function of $T$, or some other variable, thereby
effectively precipitating a phase change.

If we maintain the idea that the vacuum expectation value of the quantum
field should equal the ground state value of the classical field, the vacuum in
this $\mu^2 < 0$ case must therefore be $|0\rangle_B$ such that ${}_B\langle 0|\hat\phi|0\rangle_B$ does not vanish,
in contrast to (17.76). It is clear that this is exactly the situation met in the
superfluid (but ‘B’ here will stand for ‘broken symmetry’), and is moreover
the condition for the existence of massless (Goldstone) modes. Let us see how
they emerge in this model.

In quantum field theory, particles are thought of as excitations from a
ground state, which is the vacuum. Figure 17.2 strongly suggests that if we
want a sensible quantum interpretation of a theory with the potential (17.77),
we had better expand the fields about a point on the circle of minima, about
which stable oscillations are likely, rather than about the obviously unstable
point $\hat\phi = 0$. Let us pick the point $\rho = v$, $\theta = 0$ in the classical case. We might
well guess that ‘radial’ oscillations in $\hat\rho$ would correspond to a conventional
massive field (having a parabolic restoring potential), while ‘angle’ oscillations
in $\hat\theta$ - which pass through all the degenerate vacua - have no restoring force
and are massless. Accordingly, we set

$$ \hat\phi(x) = \frac{1}{\sqrt 2}\big(v + \hat h(x)\big)\exp\big(-i\hat\theta(x)/v\big) \tag{17.84} $$

and find (problem 17.5) that $\hat{\mathcal L}_G$ (with $\hat V = \hat V_{SB}$ of (17.77) with hats on)
becomes

$$ \hat{\mathcal L}_G = \frac{1}{2}\partial_\mu\hat h\,\partial^\mu\hat h - \mu^2\hat h^2 + \frac{1}{2}\partial_\mu\hat\theta\,\partial^\mu\hat\theta + \mu^4/\lambda $$
$$ \qquad + \frac{\hat h}{v}\partial_\mu\hat\theta\,\partial^\mu\hat\theta + \frac{\hat h^2}{2v^2}\partial_\mu\hat\theta\,\partial^\mu\hat\theta - \frac{\lambda}{16}\hat h^4 - \frac{\lambda v}{4}\hat h^3. \tag{17.85} $$
Equation (17.85) is very important. First of all, the first line shows that the
particle spectrum in the ‘spontaneously broken’ case is dramatically different
from that in the normal case: instead of two degrees of freedom with the same
mass $\mu$, one (the $\theta$-mode) is massless, and the other (the $h$-mode) has a mass
of $\sqrt 2\mu$. We expect the vacuum $|0\rangle_B$ to be annihilated by the mode operators
$\hat a_h$ and $\hat a_\theta$ for these fields. This implies, however, that

$$ {}_B\langle 0|\hat\phi|0\rangle_B = v/\sqrt 2 \tag{17.86} $$

which is consistent with our interpretation of the vacuum expectation value
(vev) as the classical minimum, and with the occurrence of massless modes.
(The constant term in (17.85), which does not affect equations of motion,
merely reflects the fact that the minimum value of $V_{SB}$ is $-\mu^4/\lambda$.) The
ansatz (17.84) and the non-zero vev (17.86) may be compared with (17.37)
and (17.52), respectively, in the superfluid case.
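
The non-derivative part of this expansion is easy to test numerically. Inserting (17.84) into (17.77) gives $\phi^\dagger\phi = (v + h)^2/2$, and with $v^2 = 4\mu^2/\lambda$ the potential should equal $-\mu^4/\lambda + \mu^2 h^2 + (\lambda v/4)h^3 + (\lambda/16)h^4$, with no term linear in $h$. The sketch below compares the two expressions at a few sample values of $h$; the parameter values and function names are arbitrary illustrative choices.

```python
import math


def v_sb_of_modulus(mod_sq, mu, lam):
    """Potential (17.77) as a function of phi^dagger phi."""
    return 0.25 * lam * mod_sq ** 2 - mu ** 2 * mod_sq


def v_expanded(h, mu, lam):
    """Expansion about the minimum: constant, mass term, cubic and quartic terms."""
    v = 2 * mu / math.sqrt(lam)
    return (-mu ** 4 / lam + mu ** 2 * h ** 2
            + 0.25 * lam * v * h ** 3 + lam * h ** 4 / 16)


def max_mismatch(mu=1.3, lam=0.7, samples=(-0.5, -0.1, 0.0, 0.2, 1.0)):
    """Largest difference between the exact and expanded potential."""
    v = 2 * mu / math.sqrt(lam)
    return max(abs(v_sb_of_modulus(0.5 * (v + h) ** 2, mu, lam)
                   - v_expanded(h, mu, lam)) for h in samples)


print(max_mismatch())  # should be at the level of rounding error
```

Comparing the quadratic term $\mu^2 h^2$ with the standard form $\frac{1}{2}m_h^2 h^2$ gives $m_h = \sqrt 2\mu$, while no quadratic term in $\theta$ appears at all.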

Secondly, the second line of equation (17.85) shows that only the derivative
of the $\hat\theta$ field appears in the interaction terms, whereas this is not true of the $\hat h$
field. Indeed, the Lagrangian for the $\theta$-mode cannot have any dependence on
a constant value of $\hat\theta$, since this could be transformed away by a global U(1)
transformation (17.72), which is a symmetry of the theory, and under which
$\hat\theta \to \hat\theta + v\alpha$. This will be an important point to remember when we consider
effective Lagrangians for Goldstone modes in section 18.3.

Goldstone’s model, then, contains much of the essence of spontaneous
symmetry breaking in field theory: a non-zero vacuum value of a field which
is not an invariant under the symmetry group, zero mass bosons, and massive
excitations in a direction in field space which is ‘orthogonal’ to the degenerate
ground states. However, it has to be noted that the triggering mechanism for
the symmetry breaking ($\mu^2 \to -\mu^2$) has to be put in by hand, in contrast to
the (admittedly approximate, but more ‘dynamical’) Bogoliubov approach.
The Goldstone model, in short, is essentially phenomenological.

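As in the case of the superfluid, we may perfectly well choose a vacuum
corresponding to a classical ground state with non-zero $\theta$, say $\theta = -v\alpha$. Then

$$ {}_B\langle 0, \alpha|\hat\phi|0, \alpha\rangle_B = e^{-i\alpha}\frac{v}{\sqrt 2} \tag{17.87} $$
$$ = e^{-i\alpha}\,{}_B\langle 0|\hat\phi|0\rangle_B, \tag{17.88} $$

as in (17.57). But we know (see (7.27) and (7.28)) that

$$ e^{-i\alpha}\hat\phi = \hat\phi' = \hat U_\alpha\hat\phi\hat U_\alpha^{-1} \tag{17.89} $$

where

$$ \hat U_\alpha = e^{i\alpha\hat N_\phi}. \tag{17.90} $$

So (17.88) becomes

$$ {}_B\langle 0, \alpha|\hat\phi|0, \alpha\rangle_B = {}_B\langle 0|\hat U_\alpha\hat\phi\hat U_\alpha^{-1}|0\rangle_B \tag{17.91} $$

and we may interpret $\hat U_\alpha^{-1}|0\rangle_B$ as the ‘alternative vacuum’ $|0, \alpha\rangle_B$ (this
argument is, as usual, not valid in the infinite volume limit where $\hat N_\phi$ fails to
exist).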

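It is interesting to find out what happens to the symmetry current
corresponding to the invariance (17.72), in the ‘broken symmetry’ case. This
current is given in (7.23) which we write again here in slightly different
notation:

$$ \hat j^\mu_\phi = i\big(\hat\phi^\dagger\partial^\mu\hat\phi - (\partial^\mu\hat\phi)^\dagger\hat\phi\big), \tag{17.92} $$

normal ordering being understood. Written in terms of the $\hat h$ and $\hat\theta$ of (17.84),
$\hat j^\mu_\phi$ becomes

$$ \hat j^\mu_\phi = v\partial^\mu\hat\theta + 2\hat h\partial^\mu\hat\theta + \hat h^2\partial^\mu\hat\theta/v. \tag{17.93} $$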


The term involving just the single field $\hat\theta$ is very remarkable: it tells us that
there is a non-zero matrix element of the form

$$ {}_B\langle 0|\hat j^\mu_\phi(x)|\theta, p\rangle = -ip^\mu v\,e^{-ip\cdot x} \tag{17.94} $$

where $|\theta, p\rangle$ stands for the state with one $\theta$-quantum (Goldstone boson), with
momentum $p^\mu$. This is easily seen by writing the usual normal mode expansion
for $\hat\theta$, and using the standard bosonic commutation relations for $\hat a_\theta(k)$, $\hat a_\theta^\dagger(k')$.
In words, (17.94) asserts that, when the symmetry is spontaneously broken,
the symmetry current connects the vacuum to a state with one Goldstone
quantum, with an amplitude which is proportional to the symmetry breaking
vacuum expectation value $v$, and which vanishes as the 4-momentum goes to
zero. The matrix element (17.94), with $x = 0$, is precisely of the type that was
shown to be non-zero in the proof of the Goldstone theorem, after (17.67).
Note also that (17.94) is consistent with $\partial_\mu\hat j^\mu_\phi = 0$ only if $p^2 = 0$, as is required
for the massless $\theta$.

We are now ready to generalize the Abelian U(1) model to the (global) 
non-Abelian case. 




17.6 Spontaneously broken global non-Abelian 
symmetry 
We can illustrate the essential features by considering a particular example,
which in fact forms part of the Higgs sector of the Standard Model. We
consider an SU(2) doublet, but this time not of fermions as in section 12.3,
but of bosons:

$$ \hat\phi = \begin{pmatrix} \hat\phi^+ \\ \hat\phi^0 \end{pmatrix} \equiv \begin{pmatrix} \frac{1}{\sqrt 2}(\hat\phi_1 + i\hat\phi_2) \\ \frac{1}{\sqrt 2}(\hat\phi_3 + i\hat\phi_4) \end{pmatrix} \tag{17.95} $$

where the complex scalar field $\hat\phi^+$ destroys positively charged particles and
creates negatively charged ones, and the complex scalar field $\hat\phi^0$ destroys
neutral particles and creates neutral antiparticles. As we shall see in a moment,
the Lagrangian we shall use has an additional U(1) symmetry, so that the
full symmetry is SU(2) x U(1). This U(1) symmetry leads to a conserved
quantum number which we call $y$. We associate the physical charge $Q$ with
the eigenvalue $t_3$ of the SU(2) generator $\hat t_3$, and with $y$, via

$$ Q = e(t_3 + y/2) \tag{17.96} $$

so that $y(\phi^+) = 1 = y(\phi^0)$. Thus $\phi^+$ and $\phi^0$ can be thought of as analogous
to the hadronic iso-doublet $(K^+, K^0)$.

The Lagrangian we choose is a simple generalization of (17.69) and (17.77):

$$ \hat{\mathcal L}_\Phi = (\partial_\mu\hat\phi^\dagger)(\partial^\mu\hat\phi) + \mu^2\hat\phi^\dagger\hat\phi - \frac{\lambda}{4}(\hat\phi^\dagger\hat\phi)^2 \tag{17.97} $$


which has the ‘spontaneous symmetry breaking’ choice of sign for the
parameter $\mu^2$. Plainly, for the ‘normal’ sign of $\mu^2$, in which ‘$+\mu^2\hat\phi^\dagger\hat\phi$’ is replaced by
‘$-\mu^2\hat\phi^\dagger\hat\phi$’, with $\mu^2$ positive in both cases, the free ($\lambda = 0$) part would describe
a complex doublet, with four degrees of freedom, each with the same mass $\mu$.
Let us see what happens in the broken symmetry case.

For the Lagrangian (17.97) with $\mu^2 > 0$, the minimum of the classical
potential is at the point

$$ (\phi^\dagger\phi)_{\rm min} = 2\mu^2/\lambda \equiv v^2/2. \tag{17.98} $$

As in the U(1) case, we interpret (17.98) as a condition on the vev of $\hat\phi^\dagger\hat\phi$,

$$ \langle 0|\hat\phi^\dagger\hat\phi|0\rangle = v^2/2. \tag{17.99} $$

Before proceeding we note that (17.97) is invariant under global SU(2)
transformations

$$ \hat\phi \to \hat\phi' = \exp(-i\boldsymbol{\alpha}\cdot\boldsymbol{\tau}/2)\hat\phi \tag{17.100} $$

but also under a separate global U(1) transformation

$$ \hat\phi \to \hat\phi' = \exp(-i\alpha)\hat\phi \tag{17.101} $$

where $\alpha$ is to be distinguished from $\boldsymbol{\alpha} = (\alpha_1, \alpha_2, \alpha_3)$. The symmetry is then
referred to as SU(2) x U(1), which is the symmetry of the electroweak sector
of the Standard Model, except that in that case it is a local symmetry.

As before, in order to get a sensible particle spectrum we must expand the
fields not about $\hat\phi = 0$ but about a point satisfying the stable ground state
(vacuum) condition (17.98). That is, we need to define ‘$\langle 0|\hat\phi|0\rangle$’ and expand
about it, as in (17.84). In the present case, however, the situation is more
complicated than (17.84) since the complex doublet (17.95) contains four real
fields as indicated in (17.95), and (17.98) becomes

$$ \langle 0|\hat\phi_1^2 + \hat\phi_2^2 + \hat\phi_3^2 + \hat\phi_4^2|0\rangle = v^2. \tag{17.102} $$

It is evident that we have a lot of freedom in choosing the $\langle 0|\hat\phi_i|0\rangle$ so that
(17.102) holds, and it is not at first obvious what an appropriate generalization
of (17.84) and (17.85) might be.

Furthermore, in this more complicated (non-Abelian) situation a
qualitatively new feature can arise: it may happen that the chosen condition
$\langle 0|\hat\phi_i|0\rangle \neq 0$ is invariant under some subset of the allowed symmetry
transformations. This would effectively mean that this particular choice of the
vacuum state respected that subset of symmetries, which would therefore not
be ‘spontaneously broken’ after all. Since each broken symmetry is associated
with a massless Goldstone boson, we would then get fewer of these bosons
than expected. Just this happens (by design) in the present case.

Suppose, then, that we could choose the $\langle 0|\hat\phi_i|0\rangle$ so as to break this SU(2)
x U(1) symmetry completely: we would then expect four massless fields.
Actually, however, it is not possible to make such a choice. An analogy may
make this point clearer. Suppose we were considering just SU(2), and the field
‘$\hat\phi$’ was an SU(2)-triplet, $\hat{\boldsymbol\phi}$. Then we could always write $\langle 0|\hat{\boldsymbol\phi}|0\rangle = v\mathbf{n}$ where
$\mathbf{n}$ is a unit vector; but this form is invariant under rotations about the $\mathbf{n}$-axis,
irrespective of where that points. In the present case, by using the freedom
of global SU(2) x U(1) phase changes, an arbitrary $\langle 0|\hat\phi|0\rangle$ can be brought to
the form

$$ \langle 0|\hat\phi|0\rangle = \begin{pmatrix} 0 \\ v/\sqrt 2 \end{pmatrix}. \tag{17.103} $$


In considering what symmetries are respected or broken by (17.103), it is
easiest to look at infinitesimal transformations. It is then clear that the particular
transformation

$$ \delta\hat\phi = -i\epsilon(1 + \tau_3)\hat\phi \tag{17.104} $$


(which is a combination of (17.101) and the ‘third component’ of (17.100)) is 
still a symmetry of (17.103) since 


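$$ (1 + \tau_3)\begin{pmatrix} 0 \\ v/\sqrt 2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \tag{17.105} $$

so that

$$ \langle 0|\hat\phi|0\rangle = \langle 0|\hat\phi + \delta\hat\phi|0\rangle; \tag{17.106} $$

we say that ‘the vacuum is invariant under (17.104)’, and when we look at
the spectrum of oscillations about that vacuum we expect to find only three
massless bosons, not four.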
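
The counting of broken and unbroken symmetries can be checked with a few lines of matrix algebra. The sketch below applies the candidate generators $\tau_1$, $\tau_2$, $1 - \tau_3$ and $1 + \tau_3$ to the vacuum value (17.103), here with $v = 1$ for illustration, and lists those that do not annihilate it. The choice of basis for the four generators and all function names are our own.

```python
import math

# 2x2 matrices as nested tuples; TAU1..TAU3 are the Pauli matrices.
IDENTITY = ((1, 0), (0, 1))
TAU1 = ((0, 1), (1, 0))
TAU2 = ((0, -1j), (1j, 0))
TAU3 = ((1, 0), (0, -1))


def add(a, b):
    return tuple(tuple(a[i][j] + b[i][j] for j in range(2)) for i in range(2))


def sub(a, b):
    return tuple(tuple(a[i][j] - b[i][j] for j in range(2)) for i in range(2))


def apply(m, vec):
    """Matrix times two-component column vector."""
    return tuple(sum(m[i][j] * vec[j] for j in range(2)) for i in range(2))


def norm(vec):
    return math.sqrt(sum(abs(c) ** 2 for c in vec))


def vacuum(v=1.0):
    """The vacuum value (17.103): upper component 0, lower component v/sqrt(2)."""
    return (0.0, v / math.sqrt(2))


def broken_generators(v=1.0):
    """Names of the generators that do not annihilate the vacuum value."""
    candidates = {
        "tau1": TAU1,
        "tau2": TAU2,
        "1 - tau3": sub(IDENTITY, TAU3),
        "1 + tau3": add(IDENTITY, TAU3),
    }
    return [name for name, g in candidates.items()
            if norm(apply(g, vacuum(v))) > 1e-12]


print(broken_generators())  # three broken directions; '1 + tau3' is absent
```

Three of the four generators move the vacuum value, and only the combination $1 + \tau_3$ of (17.104) leaves it unchanged, which is the origin of the count of three massless bosons.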
Oscillations about (17.103) are conveniently parametrized by

$$ \hat\phi = \exp\big(-i\hat{\boldsymbol\theta}(x)\cdot\boldsymbol{\tau}/2v\big)\begin{pmatrix} 0 \\ \frac{1}{\sqrt 2}\big(v + \hat H(x)\big) \end{pmatrix}, \tag{17.107} $$

which is to be compared with (17.84). Inserting (17.107) into (17.97) (see
problem 17.6) we easily find that no mass term is generated for the $\hat{\boldsymbol\theta}$ fields,
while the $\hat H$ field piece is

$$ \hat{\mathcal L}_H = \frac{1}{2}\partial_\mu\hat H\,\partial^\mu\hat H - \mu^2\hat H^2 + \text{interactions} \tag{17.108} $$

just as in (17.85), showing that $m_H = \sqrt 2\mu$.

Let us now note carefully that whereas in the ‘normal symmetry’ case
with the opposite sign for the $\mu^2$ term in (17.97), the free-particle spectrum
consisted of a degenerate doublet of four degrees of freedom all with the same
mass $\mu$, in the ‘spontaneously broken’ case no such doublet structure is seen:
instead, there is one massive scalar field, and three massless scalar fields.
The number of degrees of freedom is the same in each case, but the physical
spectrum is completely different.
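
This spectrum can be read off numerically from the curvature of the classical potential at the chosen vacuum. The sketch below writes the potential of (17.97) in terms of the four real fields, places the vacuum at $\phi_3 = v$ (which reproduces (17.103)), and estimates the second derivative along each field direction by central differences. At this point the mixed second derivatives vanish, so these four curvatures are the squared masses. Parameter values, the step size and function names are illustrative assumptions.

```python
import math


def potential(fields, mu=1.0, lam=0.5):
    """Classical potential of (17.97): -mu^2 (phi†phi) + (lam/4)(phi†phi)^2,
    with phi†phi = (phi1^2 + phi2^2 + phi3^2 + phi4^2)/2."""
    mod_sq = 0.5 * sum(f * f for f in fields)
    return -mu ** 2 * mod_sq + 0.25 * lam * mod_sq ** 2


def curvature(fields, i, mu=1.0, lam=0.5, step=1e-3):
    """Second derivative of the potential along field i (central difference)."""
    up = list(fields)
    up[i] += step
    down = list(fields)
    down[i] -= step
    return (potential(up, mu, lam) - 2 * potential(fields, mu, lam)
            + potential(down, mu, lam)) / step ** 2


def squared_masses(mu=1.0, lam=0.5):
    """Curvatures at the vacuum (phi1, phi2, phi3, phi4) = (0, 0, v, 0)."""
    v = 2 * mu / math.sqrt(lam)
    vac = [0.0, 0.0, v, 0.0]
    return [curvature(vac, i, mu, lam) for i in range(4)]


print(squared_masses())  # approximately [0, 0, 2 mu^2, 0]
```

One direction has curvature $2\mu^2$, corresponding to $m_H = \sqrt 2\mu$, and the other three are flat, corresponding to the three massless fields.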

In the application of this to the electroweak sector of the Standard Model,
the SU(2) x U(1) symmetry will be ‘gauged’ (i.e. made local), which is easily
done by replacing the ordinary derivatives in (17.97) by suitable covariant
ones. We shall see in chapter 19 that the result, with the choice (17.107),
will be to end up with three massive gauge fields (those mediating the weak
interactions) and one massless gauge field (the photon). We may summarize
this (anticipated) result by saying, then, that when a spontaneously broken
non-Abelian symmetry is gauged, those gauge fields corresponding to
symmetries that are broken by the choice of $\langle 0|\hat\phi|0\rangle$ acquire a mass, while those that
correspond to symmetries that are respected by $\langle 0|\hat\phi|0\rangle$ do not. Exactly how
this happens will be the subject of chapter 19.

We end this chapter by considering a second important example of spon- 
taneous symmetry breaking in condensed matter physics, as a preliminary to 
our discussion of chiral symmetry breaking in the following chapter. 




17.7 The BCS superconducting ground state 


We shall not attempt to provide a self-contained treatment of the
Bardeen-Cooper-Schrieffer (1957), or BCS, theory; rather, we wish simply to focus
on one aspect of the theory, namely the occurrence of an energy gap separating
the ground state from the lowest excited levels of the fermionic energy
spectrum. The existence of such a gap is a fundamental ingredient of the theory
of superconductivity; in the following chapter we shall see how Nambu (1960)
interpreted a chiral symmetry breaking fermionic mass term as an analogous
‘gap’. We emphasize at the outset that we shall here not treat electromagnetic
interactions in the superconducting state, leaving that topic for chapter 19.

Our discussion will deliberately have some similarity to that of section
17.3.2. In the present case, of course, we shall be dealing with fermions,
namely electrons, rather than the bosons of a superfluid. Nevertheless, we
shall see that a similar kind of ‘condensation’ occurs in the superconductor
too. Naturally, such a phenomenon can only occur for bosons. Thus an
essential element in the BCS theory is the identification of a mechanism whereby
pairs of electrons become correlated, the behaviour of which may have some
similarity to that of bosons. Now, direct Coulomb interaction between a pair
of electrons is repulsive, and it remains so despite the screening that occurs
in a solid. But the positively charged ions do provide sources of attraction
for the electrons, and may be used as intermediaries (via ‘electron-phonon
interactions’) to promote an effective attraction between electrons in certain
circumstances. At this point we recall the characteristic feature of a weakly
interacting gas of electrons at zero temperature: thanks to the Exclusion
Principle, the electrons populate single particle energy levels up to some maximum
energy $E_F$ (the Fermi energy), whose value is fixed by the electron density. It
turns out (see for example Kittel 1987, chapter 8) that electron-electron
scattering, mediated by phonon exchange, leads to an effective attraction between
two electrons whose energies $\epsilon_k$ lie in a thin band $E_F - \omega_D < \epsilon_k < E_F + \omega_D$
around $E_F$, where $\omega_D$ is the Debye frequency associated with lattice
vibrations. Cooper (1956) was the first to observe that the Fermi ‘sea’ was unstable
with respect to the formation of bound pairs, in the presence of an attractive
interaction. What this means is that the energy of the system can be lowered
by exciting a pair of electrons above $E_F$, which then become bound to a state
with a total energy less than $2E_F$. This instability modifies the Fermi sea in a
fundamental way: a sort of ‘condensate’ of pairs is created around the Fermi
energy, and we need a many-body formalism to handle the situation.

For simplicity we shall consider pairs of equal and opposite momentum
$\mathbf{k}$, so their total momentum is zero. It can also be argued that the effective
attraction will be greater when the spins are antiparallel, but the spin will
not be indicated explicitly in what follows: ‘$k$’ will stand for ‘$\mathbf{k}$ with spin up’,
and ‘$-k$’ for ‘$-\mathbf{k}$ with spin down’. With this by way of motivation, we thus
arrive at the BCS reduced Hamiltonian

$$ \hat H_{\rm BCS} = \sum_k \epsilon_k\,\hat c_k^\dagger\hat c_k - V\sum_{k, k'}\hat c_{k'}^\dagger\hat c_{-k'}^\dagger\hat c_{-k}\hat c_k \tag{17.109} $$

which is the starting point of our discussion. In (17.109), the $\hat c$'s are fermionic
operators obeying the usual anticommutation relations, and the ground state
is such that $\hat c_k|0\rangle = 0$. The sum is over states lying near $E_F$, as above, and
the single particle energies $\epsilon_k$ are measured relative to $E_F$. The constant $V$
(with the minus sign in front) represents a simplified form of the effective
electron-electron attraction. Note that, in the non-interacting ($V = 0$) part,
$\hat c_k^\dagger\hat c_k$ is the number operator for the electrons, which because of the Pauli
Principle has eigenvalues 0 or 1; this term is of course completely analogous
to (7.55), and sums the single particle energies $\epsilon_k$ for each occupied level.

We immediately note that $\hat H_{\rm BCS}$ is invariant under the global U(1)
transformation

$$ \hat c_k \to \hat c_k' = e^{-i\alpha}\hat c_k \tag{17.110} $$

for all $k$, which is equivalent to $\hat\psi'(\mathbf{x}) = e^{-i\alpha}\hat\psi(\mathbf{x})$ for the electron field operator
at $\mathbf{x}$. Thus fermion number is conserved by $\hat H_{\rm BCS}$. However, just as for
the superfluid, we shall see that the BCS ground state does not respect the
symmetry.

We follow Bogoliubov (1958) and Bogoliubov et al. (1959) (see also Valatin
1958), and make a canonical transformation on the operators $\hat c_k$, $\hat c_{-k}^\dagger$ similar
to the one employed for the superfluid problem in (17.38), as motivated by
the ‘pair condensate’ picture. We set

$$ \hat\beta_k = u_k\hat c_k - v_k\hat c_{-k}^\dagger, \qquad \hat\beta_k^\dagger = u_k\hat c_k^\dagger - v_k\hat c_{-k}, $$
$$ \hat\beta_{-k} = u_k\hat c_{-k} + v_k\hat c_k^\dagger, \qquad \hat\beta_{-k}^\dagger = u_k\hat c_{-k}^\dagger + v_k\hat c_k, \tag{17.111} $$

where $u_k$ and $v_k$ are real, depend only on $k = |\mathbf{k}|$, and are chosen so as to
preserve anticommutation relations for the $\hat\beta$'s. This last condition implies
(problem 17.7)

$$ u_k^2 + v_k^2 = 1 \tag{17.112} $$

so that we may conveniently set

$$ u_k = \cos\theta_k, \qquad v_k = \sin\theta_k. \tag{17.113} $$

Just as in the superfluid case, the transformations (17.111) only make sense in
the context of a number non-conserving ground state, since they do not respect
the symmetry (17.110). Although $\hat H_{\rm BCS}$ of (17.109) is number conserving, we
shall shortly make a crucial number non-conserving approximation.

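We seek a diagonalization of (17.109), analogous to (17.40), in terms of
the mode operators $\hat\beta$ and $\hat\beta^\dagger$:

$$ \hat H_{\rm BCS} = \sum_k \omega_k\big(\hat\beta_k^\dagger\hat\beta_k + \hat\beta_{-k}^\dagger\hat\beta_{-k}\big) + \gamma. \tag{17.114} $$

It is easy to check (problem 17.8) that the form (17.114) implies

$$ [\hat H_{\rm BCS}, \hat\beta_l^\dagger] = \omega_l\hat\beta_l^\dagger \tag{17.115} $$

as in (17.41), despite the fact that the operators obey anticommutation
relations. Equation (17.115) then implies that the $\omega_k$ are the energies of states
created by the quasiparticle operators $\hat\beta_k^\dagger$ and $\hat\beta_{-k}^\dagger$, the ground state being
defined by

$$ \hat\beta_k|\mathrm{ground}\rangle_{\rm BCS} = \hat\beta_{-k}|\mathrm{ground}\rangle_{\rm BCS} = 0. \tag{17.116} $$

Substituting for $\hat\beta_l^\dagger$ in (17.115) from (17.111) we therefore require

$$ [\hat H_{\rm BCS}, \cos\theta_l\,\hat c_l^\dagger - \sin\theta_l\,\hat c_{-l}] = \omega_l\big(\cos\theta_l\,\hat c_l^\dagger - \sin\theta_l\,\hat c_{-l}\big) \tag{17.117} $$

which must hold as an identity in the $\hat c_l$'s and $\hat c_l^\dagger$'s. Evaluating (17.117) one
obtains (problem 17.9)

$$ (\omega_l - \epsilon_l)\cos\theta_l - V\sin\theta_l\sum_k \hat c_{-k}\hat c_k = 0 \tag{17.118} $$
$$ -V\cos\theta_l\sum_k \hat c_k^\dagger\hat c_{-k}^\dagger + (\omega_l + \epsilon_l)\sin\theta_l = 0. \tag{17.119} $$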


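It is at this point that we make the crucial ‘condensate’ assumption: we 
replace the operator expressions $\sum_k \hat c_{-k}\hat c_k$ and $\sum_k \hat c^\dagger_k\hat c^\dagger_{-k}$ by their average 
values, which are assumed to be non-zero in the ground state. Since these 
operators carry fermion number $\pm 2$, it is clear that this assumption is only 
valid if the ground state does not, in fact, have a definite number of particles, 
just as in the superfluid case. We accordingly make the replacements 

$$V\sum_k \hat c_{-k}\hat c_k \to V\,{}_{\rm BCS}\langle{\rm ground}|\sum_k \hat c_{-k}\hat c_k|{\rm ground}\rangle_{\rm BCS} \equiv \Delta \tag{17.120}$$

$$V\sum_k \hat c^\dagger_k\hat c^\dagger_{-k} \to V\,{}_{\rm BCS}\langle{\rm ground}|\sum_k \hat c^\dagger_k\hat c^\dagger_{-k}|{\rm ground}\rangle_{\rm BCS} \equiv \Delta^*. \tag{17.121}$$

In that case, equations (17.118) and (17.119) become 

$$\omega_l\cos\theta_l = \epsilon_l\cos\theta_l + \Delta\sin\theta_l \tag{17.122}$$

$$\omega_l\sin\theta_l = -\epsilon_l\sin\theta_l + \Delta^*\cos\theta_l \tag{17.123}$$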


which are consistent if 


$$\omega_l = [\epsilon_l^2 + |\Delta|^2]^{1/2}. \tag{17.124}$$

Equation (17.124) is the fundamental result at this stage. Recalling that $\epsilon_l$ 
is measured relative to $E_{\rm F}$, we see that it implies that all excited states are 
separated from $E_{\rm F}$ by a finite amount, namely $|\Delta|$. 

In interpreting (17.124) we must however be careful to reckon energies for 
an excited state as relative to a BCS state having the same number of pairs, if 
we consider experimental probes which do not inject or remove electrons. Thus 
relative to a component of $|{\rm ground}\rangle_{\rm BCS}$ with $N$ pairs, we may consider the 
excitation of two particles above a BCS state with $N - 1$ pairs. The minimum 
energy for this to be possible is $2|\Delta|$. It is this quantity which is usually called 
the energy gap. Such an excited state is represented by $\hat\beta^\dagger_k\hat\beta^\dagger_{-k}|{\rm ground}\rangle_{\rm BCS}$. 

We shall need the expressions for $\cos\theta_l$ and $\sin\theta_l$, which may be obtained 
as follows. Squaring (17.122), and taking $\Delta$ now to be real and equal to $|\Delta|$, 
we obtain 

$$|\Delta|^2(\cos^2\theta_l - \sin^2\theta_l) = 2\epsilon_l|\Delta|\cos\theta_l\sin\theta_l, \tag{17.125}$$

which leads to 

$$\tan 2\theta_l = |\Delta|/\epsilon_l \tag{17.126}$$

and then 

$$\cos\theta_l = \left[\frac{1}{2}\left(1 + \frac{\epsilon_l}{\omega_l}\right)\right]^{1/2}, \qquad \sin\theta_l = \left[\frac{1}{2}\left(1 - \frac{\epsilon_l}{\omega_l}\right)\right]^{1/2}. \tag{17.127}$$
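
As a quick numerical illustration of (17.122) to (17.127), the sketch below evaluates the quasiparticle energy and the mixing coefficients for a few sample values of $\epsilon_l$ and checks that they satisfy the pair of linear equations. It assumes a real, positive gap; the function names and the sample numbers are illustrative only.

```python
# Illustrative sketch (not from the text): quasiparticle energy and mixing
# coefficients for one mode, following (17.124) and (17.127) with a real gap.
import math


def quasiparticle(eps, delta):
    """Return (omega, cos_theta, sin_theta) for energy eps and real gap delta > 0."""
    omega = math.hypot(eps, delta)                    # (17.124)
    cos_t = math.sqrt(0.5 * (1.0 + eps / omega))      # (17.127)
    sin_t = math.sqrt(0.5 * (1.0 - eps / omega))
    return omega, cos_t, sin_t


def residuals(eps, delta):
    """Left-hand side minus right-hand side of (17.122) and (17.123)."""
    omega, c, s = quasiparticle(eps, delta)
    r1 = omega * c - (eps * c + delta * s)
    r2 = omega * s - (-eps * s + delta * c)
    return r1, r2


if __name__ == "__main__":
    for eps in (-2.0, -0.5, 0.0, 0.3, 1.5):
        omega, c, s = quasiparticle(eps, 0.4)
        print(eps, omega, residuals(eps, 0.4))
```

The smallest excitation energy occurs at $\epsilon_l = 0$, where $\omega_l = |\Delta|$ and the electron and hole components are mixed with equal weight.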


All our experience to date indicates that the choice ‘$\Delta$ = real’ amounts to a 
choice of phase for the ground state value: 

$$V\,{}_{\rm BCS}\langle{\rm ground}|\sum_k \hat c_{-k}\hat c_k|{\rm ground}\rangle_{\rm BCS} = |\Delta|. \tag{17.128}$$

By making use of the U(1) symmetry (17.110), other phases for $\Delta$ are equally 
possible. 

The condition (17.128) has, of course, the by now anticipated form for a 
spontaneously broken U(1) symmetry, and we must therefore expect the oc- 
currence of a massless mode (which we do not demonstrate here). However, 
we may now recall that the electrons are charged, so that when electromag- 
netic interactions are included in the superconducting state, we have to allow 
the $\alpha$ in (17.110) to become a local function of $x$. At the same time, the 
massless photon field will enter. Remarkably, we shall learn in chapter 19 
that the expected massless (Goldstone) mode is, in this case, not observed: 
instead, that degree of freedom is incorporated into the gauge field, rendering 
it massive. As we shall see, this is the physics of the Meissner effect in a 
superconductor, and that of the ‘Higgs mechanism’ in the Standard Model. 
Thus in the (charged) BCS model, both a fermion mass and a gauge boson 
mass are dynamically generated. 

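An explicit formula for $\Delta$ can be found by using the definition (17.120), 
together with the expression for $\hat c_k$ found by inverting (17.111): 

$$\hat c_k = \cos\theta_k\,\hat\beta_k + \sin\theta_k\,\hat\beta^\dagger_{-k}. \tag{17.129}$$

This gives, using (17.120) and (17.129), 

$$\begin{aligned}
|\Delta| &= V\,{}_{\rm BCS}\langle{\rm ground}|\sum_k(\cos\theta_k\,\hat\beta_{-k} - \sin\theta_k\,\hat\beta^\dagger_k)(\cos\theta_k\,\hat\beta_k + \sin\theta_k\,\hat\beta^\dagger_{-k})|{\rm ground}\rangle_{\rm BCS} \\
&= V\,{}_{\rm BCS}\langle{\rm ground}|\sum_k\cos\theta_k\sin\theta_k\,\hat\beta_{-k}\hat\beta^\dagger_{-k}|{\rm ground}\rangle_{\rm BCS} \\
&= V\sum_k\frac{|\Delta|}{2[\epsilon_k^2 + |\Delta|^2]^{1/2}}.
\end{aligned} \tag{17.130}$$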


The sum in (17.130) is only over the small band $E_{\rm F} - \omega_{\rm D} < \epsilon_k < E_{\rm F} + \omega_{\rm D}$ 
over which the effective electron-electron attraction operates. Replacing the 
sum by an integral, we obtain the gap equation 

$$1 = \frac{1}{2}VN_{\rm F}\int_{-\omega_{\rm D}}^{\omega_{\rm D}}\frac{d\epsilon}{[\epsilon^2 + |\Delta|^2]^{1/2}} = VN_{\rm F}\sinh^{-1}(\omega_{\rm D}/|\Delta|) \tag{17.131}$$

where $N_{\rm F}$ is the density of states at the Fermi level. Equation (17.131) yields 

$$|\Delta| = \frac{\omega_{\rm D}}{\sinh(1/VN_{\rm F})} \approx 2\omega_{\rm D}\,e^{-1/VN_{\rm F}} \tag{17.132}$$

for $VN_{\rm F} \ll 1$. This is the celebrated BCS solution for the gap parameter 
$|\Delta|$. Perhaps the most significant thing to note about it, for our purpose, is 
that the expression for $|\Delta|$ is not an analytic function of the dimensionless 
interaction parameter $VN_{\rm F}$ (it cannot be expanded as a power series in this 
quantity), and so no perturbative treatment starting from a normal ground 
state could reach this result. The estimate (17.132) is in reasonably good 
agreement with experiment, and may be refined. 
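
The content of (17.131) and (17.132) can be illustrated numerically. The sketch below is an illustration under assumed inputs, not a calculation from the text: it takes the dimensionless coupling $g = VN_{\rm F}$ and the cut-off $\omega_{\rm D}$ as free parameters (the values used are arbitrary), evaluates the closed-form gap, compares it with the weak-coupling form, and verifies the integral form of the gap equation with a simple midpoint rule. Function names are hypothetical.

```python
# Illustrative sketch (not from the text): the BCS gap from (17.131)-(17.132).
# g stands for the dimensionless coupling V*N_F; omega_d is the cut-off.
import math


def gap_exact(omega_d, g):
    """|Delta| solving 1 = g * asinh(omega_d / |Delta|)."""
    return omega_d / math.sinh(1.0 / g)


def gap_weak_coupling(omega_d, g):
    """Weak-coupling form 2 * omega_d * exp(-1/g), valid for g << 1."""
    return 2.0 * omega_d * math.exp(-1.0 / g)


def gap_equation_rhs(delta, omega_d, g, steps=20000):
    """Midpoint-rule value of (g/2) * integral of d(eps) / sqrt(eps^2 + delta^2)."""
    h = 2.0 * omega_d / steps
    total = 0.0
    for i in range(steps):
        eps = -omega_d + (i + 0.5) * h
        total += h / math.sqrt(eps * eps + delta * delta)
    return 0.5 * g * total


if __name__ == "__main__":
    omega_d, g = 1.0, 0.3
    delta = gap_exact(omega_d, g)
    print(delta, gap_weak_coupling(omega_d, g), gap_equation_rhs(delta, omega_d, g))
```

Because $e^{-1/g}$ has an essential singularity at $g = 0$, every coefficient of a power series in $g$ about $g = 0$ vanishes, which is the non-analyticity referred to above.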

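The explicit form of the ground state in this model can be found by a 
method similar to the one indicated in section 17.3.2 for the superfluid. Since 
the transformation from the $\hat c$'s to the $\hat\beta$'s is canonical, there must exist a 
unitary operator which effects it via (compare (17.48)) 

$$\hat U_{\rm BCS}\,\hat c_k\,\hat U^\dagger_{\rm BCS} = \hat\beta_k, \qquad \hat U_{\rm BCS}\,\hat c^\dagger_{-k}\,\hat U^\dagger_{\rm BCS} = \hat\beta^\dagger_{-k}. \tag{17.133}$$

The operator $\hat U_{\rm BCS}$ is (Blatt 1964 section V.4, Yosida 1958, and compare 
problem 17.4) 

$$\hat U_{\rm BCS} = \prod_k\exp[\theta_k(\hat c^\dagger_k\hat c^\dagger_{-k} - \hat c_{-k}\hat c_k)]. \tag{17.134}$$

Then, since $\hat c_k|0\rangle = 0$, we have 

$$\hat U^\dagger_{\rm BCS}\,\hat\beta_k\,\hat U_{\rm BCS}|0\rangle = 0 \tag{17.135}$$

showing that we may identify 

$$|{\rm ground}\rangle_{\rm BCS} = \hat U_{\rm BCS}|0\rangle \tag{17.136}$$

via the condition (17.116). When the exponential in $\hat U_{\rm BCS}$ is expanded out, 
and applied to the vacuum state $|0\rangle$, great simplifications occur. Consider the 
operator 

$$\hat s_k = \hat c^\dagger_k\hat c^\dagger_{-k} - \hat c_{-k}\hat c_k. \tag{17.137}$$

We have 

$$\hat s_k^2 = -\hat c^\dagger_k\hat c^\dagger_{-k}\hat c_{-k}\hat c_k - \hat c_{-k}\hat c_k\hat c^\dagger_k\hat c^\dagger_{-k} \tag{17.138}$$

so that $\hat s_k^2|0\rangle = -|0\rangle$. It follows that 

$$\begin{aligned}
\exp(\theta_k\hat s_k)|0\rangle &= \left(1 + \theta_k\hat s_k - \frac{\theta_k^2}{2} - \frac{\theta_k^3}{3!}\hat s_k + \ldots\right)|0\rangle \\
&= (\cos\theta_k + \sin\theta_k\,\hat s_k)|0\rangle \\
&= (\cos\theta_k + \sin\theta_k\,\hat c^\dagger_k\hat c^\dagger_{-k})|0\rangle
\end{aligned} \tag{17.139}$$

and hence 

$$|{\rm ground}\rangle_{\rm BCS} = \prod_k(\cos\theta_k + \sin\theta_k\,\hat c^\dagger_k\hat c^\dagger_{-k})|0\rangle. \tag{17.140}$$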


As for the superfluid, (17.140) represents a coherent superposition of corre- 
lated pairs, with no restraint on the particle number. 
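
The action of the pair-rotation operator on the vacuum can be verified explicitly for one pair of modes. The sketch below is an illustration, not part of the original text: it assumes the generator $\hat s_k = \hat c^\dagger_k\hat c^\dagger_{-k} - \hat c_{-k}\hat c_k$ (the operator ordering is the assumption being tested), builds $\exp(\theta\hat s_k)$ from its Taylor series on a four-dimensional Fock space, and compares the results with $(\cos\theta + \sin\theta\,\hat c^\dagger_k\hat c^\dagger_{-k})|0\rangle$ and with $\hat\beta_k$ of (17.111). The matrix representation and all names are hypothetical.

```python
# Illustrative check (not from the text): exp(theta * s)|0> for one (k, -k) pair,
# with s = c_k^dag c_{-k}^dag - c_{-k} c_k, and the rotation of c_k into beta_k.
import math

A = [[0.0, 1.0], [0.0, 0.0]]    # single-mode annihilator in the basis (|0>, |1>)
Z = [[1.0, 0.0], [0.0, -1.0]]   # Jordan-Wigner sign factor
I2 = [[1.0, 0.0], [0.0, 1.0]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def kron(a, b):
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def combine(a, b, sa, sb):
    return [[sa * x + sb * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def expm(m, terms=30):
    """Matrix exponential from a truncated Taylor series (adequate for small norms)."""
    n = len(m)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for order in range(1, terms):
        term = [[x / order for x in row] for row in matmul(term, m)]
        result = combine(result, term, 1.0, 1.0)
    return result


C_K = kron(A, I2)    # c_k
C_MK = kron(Z, A)    # c_{-k}


def bcs_pair_checks(theta):
    """Return (state_error, operator_error) for a single pair of modes."""
    pair_create = matmul(transpose(C_K), transpose(C_MK))   # c_k^dag c_{-k}^dag
    pair_destroy = matmul(C_MK, C_K)                         # c_{-k} c_k
    s = combine(pair_create, pair_destroy, 1.0, -1.0)
    u_bcs = expm([[theta * x for x in row] for row in s])
    vacuum = [1.0, 0.0, 0.0, 0.0]
    ground = [sum(u_bcs[i][j] * vacuum[j] for j in range(4)) for i in range(4)]
    pair = [sum(pair_create[i][j] * vacuum[j] for j in range(4)) for i in range(4)]
    expected = [math.cos(theta) * v + math.sin(theta) * p for v, p in zip(vacuum, pair)]
    state_error = max(abs(g - e) for g, e in zip(ground, expected))
    # s is real and antisymmetric, so the inverse of u_bcs is its transpose
    rotated = matmul(matmul(u_bcs, C_K), transpose(u_bcs))
    beta_k = combine(C_K, transpose(C_MK), math.cos(theta), -math.sin(theta))
    operator_error = max(abs(x - y) for rr, rb in zip(rotated, beta_k)
                         for x, y in zip(rr, rb))
    return state_error, operator_error


if __name__ == "__main__":
    print(bcs_pair_checks(0.6))
```

Both errors are at rounding level, which confirms, for this two-mode truncation, that the same rotation produces the pair state and the quasiparticle operator.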

We should emphasize that the above is only the barest outline of a simple 
version of BCS theory, with no electromagnetic interactions, from which many 
subtleties have been omitted. Consider, for example, the binding energy $E_{\rm b}$ 
of a pair, to calculate which one needs to evaluate the constant $\gamma$ in (17.114). 
To a good approximation one finds (see for example Enz 1992) $E_{\rm b} \approx 3\Delta^2/E_{\rm F}$. 
One can also calculate the approximate spatial extension of a pair, which is 
denoted by the coherence length $\xi$ and is of order $v_{\rm F}/\pi\Delta$ where $k_{\rm F} = mv_{\rm F}$ 
is the Fermi momentum. If we compare $E_{\rm b}$ to the Coulomb repulsion at a 
distance $\xi$ we find 

$$E_{\rm b}/(\alpha/\xi) \sim a_0/\xi \tag{17.141}$$

where $a_0$ is the Bohr radius. Numerical values show that the right-hand side 
of (17.141), in conventional superconductors, is of order $10^{-3}$. Hence the pairs 
are not really bound, only correlated, and as many as $10^6$ pairs may have their 
centres of mass within one coherence length of each other. Nevertheless, the 
simple theory presented here contains the essential features which underlie all 
attempts to understand the dynamical occurrence of spontaneous symmetry 
breaking in fermionic systems. 
We now proceed to an important application in particle physics. 


Problems 
17.1 Verify (17.29). 
17.2 Verify (17.35). 
17.3 Derive (17.43) and (17.44). 
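17.4 Let 
$$\hat U_\lambda = \exp[\tfrac{1}{2}\lambda\theta(\hat a^2 - \hat a^{\dagger 2})]$$
where $[\hat a, \hat a^\dagger] = 1$ and $\lambda$, $\theta$ are real parameters. 

(a) Show that $\hat U_\lambda$ is unitary. 

(b) Let 
$$\hat I_\lambda = \hat U_\lambda\,\hat a\,\hat U_\lambda^{-1} \quad\text{and}\quad \hat J_\lambda = \hat U_\lambda\,\hat a^\dagger\,\hat U_\lambda^{-1}.$$
Show that 
$$\frac{d\hat I_\lambda}{d\lambda} = \theta\hat J_\lambda$$
and that 
$$\frac{d\hat J_\lambda}{d\lambda} = \theta\hat I_\lambda.$$

(c) Hence show that 
$$\hat I_\lambda = \cosh(\lambda\theta)\,\hat a + \sinh(\lambda\theta)\,\hat a^\dagger,$$
and thus finally (compare (17.38) and (17.48)) that 
$$\hat U_1\,\hat a\,\hat U_1^{-1} = \cosh\theta\,\hat a + \sinh\theta\,\hat a^\dagger \equiv \hat\alpha$$
and 
$$\hat U_1\,\hat a^\dagger\,\hat U_1^{-1} = \sinh\theta\,\hat a + \cosh\theta\,\hat a^\dagger \equiv \hat\alpha^\dagger,$$
where 
$$\hat U_1 \equiv \hat U_{\lambda=1} = \exp[\tfrac{1}{2}\theta(\hat a^2 - \hat a^{\dagger 2})].$$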


17.5 Insert the ansatz (17.84) for $\phi$ into $\mathcal{L}_{\rm G}$ of (17.69), with $V = V_{\rm SB}$ of 
(17.77), and show that the result for the constant term, and the quadratic 
terms in $h$ and $\theta$, is as given in (17.85). 


17.6 Verify that when (17.107) is inserted in (17.97), the terms quadratic in 
the fields $H$ and $\theta$ reveal that $\theta$ is a massless field, while the quanta of the $H$ 
field have mass $\sqrt{2}\mu$. 


17.7 Verify that the $\hat\beta$'s of (17.111) satisfy the required anticommutation 
relations if (17.112) holds. 


17.8 Verify (17.115). 
17.9 Derive (17.118) and (17.119). 


Chiral Symmetry Breaking 


In section 12.4.2 we arrived at a puzzle: there seemed good reason to think 
that a world consisting of u and d quarks and their antiparticles, interacting 
via the colour gauge fields of QCD, should exhibit signs of the non-Abelian 
chiral symmetry SU(2)$_{f5}$, which was exact in the massless limit $m_u, m_d \to 0$. 
But, as we showed, one of the simplest consequences of such a symmetry 
should be the existence of nucleon parity doublets, which are not observed. 
We can now resolve this puzzle by making the hypothesis (section 18.1) first 
articulated by Nambu (1960) and Nambu and Jona-Lasinio (1961a), that this 
chiral symmetry is spontaneously broken as a dynamical effect — presumably, 
from today’s perspective, as a property of the QCD interactions, as discussed 
in section 18.1.1. If this is so, an immediate physical consequence should be 
the appearance of massless (Goldstone) bosons, one for every symmetry not 
respected by the vacuum. Indeed, returning to (12.168) which we repeat here 
for convenience, 


$$\hat T^{(\frac{1}{2})}_{+5}|d\rangle = |\tilde u\rangle, \tag{18.1}$$

we now interpret the state $|\tilde u\rangle$ (which is degenerate with $|d\rangle$) as $|d + \text{‘}\pi^+\text{’}\rangle$ 
where ‘$\pi^+$’ is a massless particle of positive charge, but a pseudoscalar ($0^-$) 
rather than a scalar ($0^+$) since, as we saw, $|\tilde u\rangle$ has opposite parity to $|u\rangle$. In 
the same way, ‘$\pi^-$’ and ‘$\pi^0$’ will be associated with $\hat T^{(\frac{1}{2})}_{-5}$ and $\hat T^{(\frac{1}{2})}_{3\,5}$. Of course, 
no such massless pseudoscalar particles are observed: but it is natural to hope 
that when the small up and down quark masses are included, the real pions 
($\pi^+, \pi^-, \pi^0$) will emerge as ‘anomalously light’, rather than strictly massless. 
This is indeed how they do appear, particularly with respect to the octet of 
vector ($1^-$) mesons, which differ only in $q\bar q$ spin alignment from the $0^-$ octet. As Nambu 
and Jona-Lasinio (1961a) said, ‘it is perhaps not a coincidence that there 
exists such an entity [i.e. the Goldstone state(s)] in the form of the pion’. 

If this was the only observable consequence of spontaneously breaking chi- 
ral symmetry, it would perhaps hardly be sufficient grounds for accepting 
the hypothesis. But there are two circumstances which greatly increase the 
phenomenological implications of the idea. First, the vector and axial vector 
symmetry currents $\hat T^{(\frac{1}{2})\mu}$ and $\hat T^{(\frac{1}{2})\mu}_5$ of the u-d strong interaction SU(2) 
symmetries (see (12.109) and (12.165)) happen to be the very same currents 
which enter into strangeness-conserving semileptonic weak interactions (such 
as $n \to p\,e^-\bar\nu_e$ and $\pi^- \to \mu^-\bar\nu_\mu$), as we shall see in chapter 20. Thus some 
remarkable connections between weak- and strong-interaction parameters can be 
established, such as the Goldberger-Treiman (1958) relation (see section 18.2) 
and the Adler—Weisberger (Adler 1965, Weisberger 1965) relation. Second, it 
turns out that the dynamics of the Goldstone modes, and their interactions 
with other hadrons such as nucleons, are strongly constrained by the under- 
lying chiral symmetry of QCD; indeed, surprisingly detailed effective theories 
(see section 18.3) have been developed, which provide a very successful de- 
scription of the low energy dynamics of the Goldstone degrees of freedom. 
Finally we shall introduce the subject of chiral anomalies in section 18.4. 

It would take us too far from our main focus on gauge theories to pursue 
these interesting avenues in any detail. But we hope to convince the reader, in 
this chapter, that chiral symmetry breaking is an integral part of the Standard 
Model, being a fundamental property of QCD. 




18.1 The Nambu analogy 


We recall from section 12.4.2 that for ‘almost massless’ fermions it is natural 
to use the representation (3.40) for the Dirac matrices, in terms of which the 
Dirac equation reads 


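$$E\phi = \boldsymbol\sigma\cdot\boldsymbol p\,\phi + m\chi \tag{18.2}$$

$$E\chi = -\boldsymbol\sigma\cdot\boldsymbol p\,\chi + m\phi. \tag{18.3}$$

Nambu (1960) and Nambu and Jona-Lasinio (1961a) pointed out a remarkable 
analogy between (18.2) and (18.3) and equations (17.122) and (17.123) which 
describe the elementary excitations in a superconductor (in the case $\Delta$ is real), 
and which we repeat here for convenience: 

$$\omega_l\cos\theta_l = \epsilon_l\cos\theta_l + \Delta\sin\theta_l \tag{18.4}$$

$$\omega_l\sin\theta_l = -\epsilon_l\sin\theta_l + \Delta\cos\theta_l. \tag{18.5}$$

In (18.4) and (18.5), $\cos\theta_l$ and $\sin\theta_l$ are respectively the components of the 
electron destruction operator $\hat c_l$ and the electron creation operator $\hat c^\dagger_{-l}$ in the 
quasiparticle operator $\hat\beta_l$ (see (17.111)): 

$$\hat\beta_l = \cos\theta_l\,\hat c_l - \sin\theta_l\,\hat c^\dagger_{-l}. \tag{18.6}$$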


FIGURE 18.1 
The type of fermion-antifermion in the ‘Nambu chiral condensate’. 

The superposition in $\hat\beta_l$ combines operators which transform differently under 
the U(1) (number) symmetry. The result of this spontaneous breaking of the 
U(1) symmetry is the creation of the gap $\Delta$ (or $2\Delta$ for a number-conserving 
excitation), and the appearance of a massless mode. If $\Delta$ vanishes, (17.126) 
implies that $\theta_l = 0$, and we revert to the symmetry-respecting operators 
$\hat c_l$, $\hat c^\dagger_l$. Consider now (18.2) and (18.3). Here $\phi$ and $\chi$ are the components of 
definite chirality in the Dirac spinor $\omega$ (compare (12.149)), which is itself not 
a chirality eigenstate when $m \neq 0$. When $m$ vanishes, the Dirac equation for 
$\omega$ decouples into two separate ones for the chirality eigenstates 
$\phi_{\rm R} \equiv \begin{pmatrix}\phi\\ 0\end{pmatrix}$ and $\phi_{\rm L} \equiv \begin{pmatrix}0\\ \chi\end{pmatrix}$. 
Nambu therefore made the following analogy: 


Superconducting gap parameter $\Delta$   $\leftrightarrow$   Dirac mass $m$ 
quasiparticle excitation              $\leftrightarrow$   massive Dirac particle 
U(1) number symmetry                  $\leftrightarrow$   U(1)$_5$ chirality symmetry 
Goldstone mode                        $\leftrightarrow$   massless boson. 


In short, the mass of a Dirac particle arises from the (presumed) spontaneous 
breaking of a chiral (or $\gamma_5$) symmetry, and this will be accompanied by a 
massless boson. 

Before proceeding we should note that there are features of the analogy, 
on both sides, which need qualification. First, the particle symmetry we want 
to interpret this way is SU(2)$_{f5}$ not U(1)$_5$, so the appropriate generalization 
(Nambu and Jona-Lasinio 1961b) has to be understood. Second, we must 
again note that the BCS electrons are charged, so that in the real superconducting 
case we are dealing with a spontaneously broken local U(1) symmetry, 
not a global one. By contrast, the SU(2)$_{f5}$ chiral symmetry is not gauged. 

As usual, the quantum field theory vacuum is analogous to the many- 
body ground state. According to Nambu’s analogy, therefore, the vacuum 
for a massive Dirac particle is to be pictured as a condensate of correlated 
pairs of massless fermions. Since the vacuum carries neither linear nor angular 
momentum, the members of a pair must have equal and opposite spin: they 
therefore have the same helicity. However, since the vacuum does not violate 
fermion number conservation, one has to be a fermion and the other an an- 
tifermion. This means (recalling the discussion after (12.147)) that they have 
opposite chirality. Thus a typical pair in the Nambu vacuum is as shown in 
figure 18.1. We may easily write down an expression for the Nambu vacuum, 
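analogous to (17.140) for the BCS ground state. Consider solutions $\phi_+$ and 
$\chi_+$ of positive helicity in (18.2) and (18.3); then 

$$E\phi_+ = |\boldsymbol p|\phi_+ + m\chi_+ \tag{18.7}$$

$$E\chi_+ = -|\boldsymbol p|\chi_+ + m\phi_+. \tag{18.8}$$

Comparing (18.7) and (18.8) with (18.4) and (18.5), we can read off the mixing 
coefficients $\cos\theta_p$ and $\sin\theta_p$ as (cf (17.127)) 

$$\cos\theta_p = \left[\frac{1}{2}\left(1 + \frac{|\boldsymbol p|}{E}\right)\right]^{1/2} \tag{18.9}$$

$$\sin\theta_p = \left[\frac{1}{2}\left(1 - \frac{|\boldsymbol p|}{E}\right)\right]^{1/2} \tag{18.10}$$

where $E = (m^2 + \boldsymbol p^2)^{1/2}$. The Nambu vacuum is then given by$^1$ 

$$|0\rangle_{\rm N} = \prod_{\boldsymbol p,s}(\cos\theta_p - s\sin\theta_p\,\hat c^\dagger_s(\boldsymbol p)\hat d^\dagger_s(-\boldsymbol p))|0\rangle_{m=0}, \tag{18.11}$$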


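where the $\hat c^\dagger_s$'s and $\hat d^\dagger_s$'s are the operators in massless Dirac fields. Depending on 
the sign of the helicity $s$, each pair in (18.11) carries $\pm 2$ units of chirality. We 
may check this by noting that in the mode expansion of the Dirac field $\hat\psi$, 
$\hat c_s(\boldsymbol p)$ operators go with $u$-spinors for which the $\gamma_5$ eigenvalue equals the helicity, 
while $\hat d^\dagger_s(-\boldsymbol p)$ operators accompany $v$-spinors for which the $\gamma_5$ eigenvalue 
equals minus the helicity. Thus under a chiral transformation $\hat\psi' = e^{-i\beta\gamma_5}\hat\psi$, 
$\hat c_s \to e^{-i\beta s}\hat c_s$ and $\hat d^\dagger_s \to e^{i\beta s}\hat d^\dagger_s$, for a given $s$. Hence $\hat c^\dagger_s\hat d^\dagger_s$ acquires a factor 
$e^{2i\beta s}$. Thus the Nambu vacuum does not have a definite chirality, and operators 
carrying non-zero chirality can have non-vanishing vacuum expectation 
values. A mass term $\bar{\hat\psi}\hat\psi$ is of just this kind, since under $\hat\psi' = e^{-i\beta\gamma_5}\hat\psi$ we find 
$\hat\psi^{\dagger\prime}\gamma^0\hat\psi' = \hat\psi^\dagger e^{i\beta\gamma_5}\gamma^0 e^{-i\beta\gamma_5}\hat\psi = \bar{\hat\psi}e^{-2i\beta\gamma_5}\hat\psi$. Thus, in analogy with (17.120), a 
Dirac mass is associated with a non-zero value for ${}_{\rm N}\langle 0|\bar{\hat\psi}\hat\psi|0\rangle_{\rm N}$. 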
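
The statement that the mass term carries chirality, while a term such as $\hat\psi^\dagger\hat\psi$ does not, can be checked with explicit matrices. The sketch below is an illustration under an assumed reduced representation: it works on a two-dimensional chirality space in which $\gamma_5 = {\rm diag}(1, -1)$ and $\gamma^0$ is off-diagonal, as suggested by the form of (18.2) and (18.3). It compares $e^{i\beta\gamma_5}\gamma^0e^{-i\beta\gamma_5}$ with $\gamma^0e^{-2i\beta\gamma_5}$. The names are hypothetical.

```python
# Illustrative check (not from the text): a chiral rotation changes the mass
# bilinear but leaves psi^dagger psi unchanged.
# Assumption: reduced 2x2 chirality space, gamma5 = diag(1, -1), gamma0 off-diagonal.
import cmath

GAMMA0 = [[0j, 1 + 0j], [1 + 0j, 0j]]
IDENTITY = [[1 + 0j, 0j], [0j, 1 + 0j]]


def chiral_rotation(beta):
    """Matrix exp(-i * beta * gamma5) with gamma5 = diag(1, -1)."""
    return [[cmath.exp(-1j * beta), 0j], [0j, cmath.exp(1j * beta)]]


def matmul2(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger2(a):
    return [[a[j][i].conjugate() for j in range(2)] for i in range(2)]


def max_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


def transformed_kernel(kernel, beta):
    """Kernel K' such that psi'^dagger K psi' = psi^dagger K' psi."""
    u = chiral_rotation(beta)
    return matmul2(matmul2(dagger2(u), kernel), u)


if __name__ == "__main__":
    beta = 0.37
    mass_kernel = transformed_kernel(GAMMA0, beta)
    print(max_diff(mass_kernel, matmul2(GAMMA0, chiral_rotation(2 * beta))))
    print(max_diff(transformed_kernel(IDENTITY, beta), IDENTITY))
```

The first difference vanishes to rounding accuracy, reproducing the factor $e^{-2i\beta\gamma_5}$ quoted above, and the second shows that the chirality-conserving bilinear is untouched.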

In the original conception by Nambu and co-workers, the fermion under 
discussion was taken to be the nucleon, with ‘$m$’ the (spontaneously generated) 
nucleon mass. The fermion-fermion interaction, necessarily invariant 
under chiral transformations, was taken to be of the four-fermion type. As 
we have seen in volume 1, this is actually a non-renormalizable theory, but a 
physical cut-off was employed, somewhat analogous to the Fermi energy $E_{\rm F}$. 
Thus the nucleon mass could not be dynamically predicted, unlike the analogous 
gap parameter $\Delta$ in BCS theory. Nevertheless, a gap equation similar 
to (17.131) could be formulated, and it was possible to show that when it 
had a non-trivial solution, a massless bound state automatically appeared in 
the $f\bar f$ channel (Nambu and Jona-Lasinio 1961a). This work was generalized 
to the SU(2)$_{f5}$ case by Nambu and Jona-Lasinio (1961b), who showed that 
if the chiral symmetry was broken explicitly by the introduction of a small 
nucleon mass (~ 5 MeV), then the Goldstone pions would have their observed 
non-zero (but small) mass. In addition, the Goldberger-Treiman (1958) relation 
was derived, and a number of other applications were suggested. Subsequently, 
Nambu with other collaborators (Nambu and Lurié 1962, Nambu 
and Schrauner 1962) showed how the amplitudes for the emission of a single 
‘soft’ (nearly massless, low momentum) pion could be calculated, for various 
processes. These developments culminated in the Adler-Weisberger relation 
(Adler 1965, Weisberger 1965) which involves two soft pions. 

$^1$ A different phase convention is used for $\hat d^\dagger_s(-\boldsymbol p)$ as compared to that for $\hat c^\dagger_{-k}$ in (17.111). 

This work was all done in the absence of an agreed theory of the strong 
interactions (the NJ-L theory was an illustrative working model of dynami- 
cally generated spontaneous symmetry breaking, but not a complete theory 
of strong interactions). QCD became widely accepted as that theory around 
1973. In this case, of course, the ‘fermions in question’ are quarks, and the 
interactions between them are gluon exchanges, which conserve chirality as 
noted in section 12.4.2. The bulk of the masses of the qqq bound states which 
form baryons is then interpreted as being spontaneously generated, while a 
small explicit quark mass term in the Lagrangian is responsible for the non- 
zero pion mass. Let us therefore now turn to two-flavour QCD. 


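18.1.1 Two flavour QCD and SU(2)$_{f{\rm L}}\times$SU(2)$_{f{\rm R}}$ 
Let us begin with the massless case, for which the fermionic part of the Lagrangian 
is 

$$\hat{\mathcal L}_q = \bar{\hat u}\,i\,{\not\!\!\hat D}\,\hat u + \bar{\hat d}\,i\,{\not\!\!\hat D}\,\hat d \tag{18.12}$$

where $\hat u$ and $\hat d$ now stand for the field operators, 

$$\hat D^\mu = \partial^\mu + ig_s\boldsymbol\lambda/2\cdot\hat{\boldsymbol A}^\mu, \tag{18.13}$$

and the $\lambda$ matrices act on the colour (r,b,g) degree of freedom of the u and d 
quarks. This Lagrangian is invariant under 

(i) U(1)$_f$ ‘quark number’ transformations 
$$\hat q \to e^{-i\alpha}\hat q; \tag{18.14}$$
(ii) SU(2)$_f$ ‘flavour isospin’ transformations 
$$\hat q \to \exp(-i\boldsymbol\alpha\cdot\boldsymbol\tau/2)\,\hat q; \tag{18.15}$$
(iii) U(1)$_{f5}$ ‘axial quark number’ transformations 
$$\hat q \to e^{-i\beta\gamma_5}\hat q; \tag{18.16}$$
(iv) SU(2)$_{f5}$ ‘axial flavour isospin’ transformations 
$$\hat q \to \exp(-i\boldsymbol\beta\cdot\boldsymbol\tau/2\,\gamma_5)\,\hat q, \tag{18.17}$$

where 

$$\hat q = \begin{pmatrix}\hat u\\ \hat d\end{pmatrix}. \tag{18.18}$$

Symmetry (i) is unbroken, and its associated ‘charge’ operator (the quark 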
number operator) commutes with all other symmetry operators, so it need 
not concern us further. Symmetry (ii) is the standard isospin symmetry of 
chapter 12, explicitly broken by the electromagnetic interactions (and by the 
difference in the masses $m_u$ and $m_d$, when included). Symmetry (iii) does 
not correspond to any known conservation law; on the other hand, there are 
not any near-massless isoscalar $0^-$ mesons, either, such as must be present 
if the symmetry is spontaneously broken. The $\eta$ meson is an isoscalar $0^-$ 
meson, but with a mass of 547 MeV it is considerably heavier than the pion. 
In fact, it can be understood as one of the Goldstone bosons associated with 
the spontaneous breaking of the larger group SU(3)$_{f5}$, which includes the s 
quark (see section 18.3.3). In that case, the symmetry (iii) becomes extended 
to 

$$\hat u \to e^{-i\beta\gamma_5}\hat u, \qquad \hat d \to e^{-i\beta\gamma_5}\hat d, \qquad \hat s \to e^{-i\beta\gamma_5}\hat s, \tag{18.19}$$

but there is still a missing light isoscalar $0^-$ meson. It can be shown that 
its mass must be less than or equal to $\sqrt{3}\,m_\pi$ (Weinberg 1975), but no such 
particle exists. This is the famous ‘U(1) problem’: it was resolved by ’t 
Hooft (1976a, 1986), by showing that the inclusion of instanton configurations 
(Belavin et al. 1975) in path integrals leads to violations of symmetry (iii); 
see, for example, Weinberg (1996) section 23.5. Finally, symmetry (iv) is the 
one with which we are presently concerned. 

The symmetry currents associated with (iv) are those already given in 
(12.165), but we give them again here in a slightly different notation which 
will be similar to the one used for weak interactions: 


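$$\hat j^\mu_{i,5} = \bar{\hat q}\gamma^\mu\gamma_5\frac{\tau_i}{2}\hat q, \qquad i = 1,2,3. \tag{18.20}$$

Similarly the currents associated with (ii) are 

$$\hat j^\mu_i = \bar{\hat q}\gamma^\mu\frac{\tau_i}{2}\hat q, \qquad i = 1,2,3. \tag{18.21}$$

The corresponding ‘charges’ are (compare (12.166)) 

$$\hat Q_{i,5} \equiv \int\hat j^0_{i,5}\,d^3x = \int\hat q^\dagger\gamma_5\frac{\tau_i}{2}\hat q\,d^3x, \tag{18.22}$$

previously denoted by $\hat T^{(\frac{1}{2})}_5$, and (compare (12.101)), 

$$\hat Q_i = \int\hat q^\dagger\frac{\tau_i}{2}\hat q\,d^3x, \tag{18.23}$$

previously denoted by $\hat T^{(\frac{1}{2})}$. As with all symmetries, it is interesting to discover 
the algebra of the generators, which are the six charges $\hat Q_i$, $\hat Q_{i,5}$ in this 
case. Patient work with the anticommutation relations for the operators in 
$\hat q(x)$ and $\hat q^\dagger(x)$ gives the results (problem 18.1) 

$$[\hat Q_i, \hat Q_j] = i\epsilon_{ijk}\hat Q_k \tag{18.24}$$

$$[\hat Q_i, \hat Q_{j,5}] = i\epsilon_{ijk}\hat Q_{k,5} \tag{18.25}$$

$$[\hat Q_{i,5}, \hat Q_{j,5}] = i\epsilon_{ijk}\hat Q_k. \tag{18.26}$$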

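Relation (18.24) has been seen before in (12.101), and simply says that the 
$\hat Q_i$'s obey an SU(2) algebra. A simple trick reduces the rather complicated 
algebra of (18.24)-(18.26) to something much simpler. Defining 

$$\hat Q_{i,{\rm R}} = \frac{1}{2}(\hat Q_i + \hat Q_{i,5}), \qquad \hat Q_{i,{\rm L}} = \frac{1}{2}(\hat Q_i - \hat Q_{i,5}) \tag{18.27}$$

we find (problem 18.2) 

$$[\hat Q_{i,{\rm R}}, \hat Q_{j,{\rm R}}] = i\epsilon_{ijk}\hat Q_{k,{\rm R}} \tag{18.28}$$

$$[\hat Q_{i,{\rm L}}, \hat Q_{j,{\rm L}}] = i\epsilon_{ijk}\hat Q_{k,{\rm L}} \tag{18.29}$$

$$[\hat Q_{i,{\rm R}}, \hat Q_{j,{\rm L}}] = 0. \tag{18.30}$$

The operators $\hat Q_{i,{\rm R}}$, $\hat Q_{i,{\rm L}}$ therefore behave like two commuting (independent) 
angular momentum operators, each obeying the algebra of SU(2). For this 
reason, the symmetry group of the combined symmetries (ii) and (iv) is called 
SU(2)$_{f{\rm L}}\times$SU(2)$_{f{\rm R}}$. 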
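
The algebra (18.24) to (18.30) can be checked with finite matrices. The sketch below is an illustration, not the field-theoretic calculation of problems 18.1 and 18.2: it relies on the standard fact that charges of the form $\int\hat q^\dagger M\hat q\,d^3x$ obey the same commutation relations as the matrices $M$, and it assumes a reduced representation in which $\gamma_5 = {\rm diag}(1, -1)$ acts on a two-dimensional chirality space. All names are hypothetical.

```python
# Illustrative check (not from the text): the matrices tau_i/2 (x) 1 and
# tau_i/2 (x) gamma5 reproduce the charge algebra (18.24)-(18.30).
# Assumption: gamma5 is represented by diag(1, -1) on a 2-dim chirality space.
TAU = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]
I2 = [[1, 0], [0, 1]]
G5 = [[1, 0], [0, -1]]
ZERO = [[0] * 4 for _ in range(4)]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def kron(a, b):
    return [[a[i][j] * b[k][l] for j in range(len(a[0])) for l in range(len(b[0]))]
            for i in range(len(a)) for k in range(len(b))]


def combine(a, b, sa, sb):
    return [[sa * x + sb * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def comm(a, b):
    return combine(matmul(a, b), matmul(b, a), 1, -1)


def eps(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2


HALF_TAU = [[[0.5 * x for x in row] for row in t] for t in TAU]
Q = [kron(t, I2) for t in HALF_TAU]       # isospin charges
Q5 = [kron(t, G5) for t in HALF_TAU]      # axial isospin charges
QR = [combine(a, b, 0.5, 0.5) for a, b in zip(Q, Q5)]
QL = [combine(a, b, 0.5, -0.5) for a, b in zip(Q, Q5)]


def max_deviation(left, right, target):
    """Largest |[left_i, right_j] - i eps_ijk target_k|; target=None means zero."""
    worst = 0.0
    for i in range(3):
        for j in range(3):
            want = ZERO
            if target is not None:
                for k in range(3):
                    want = combine(want, target[k], 1, 1j * eps(i, j, k))
            got = comm(left[i], right[j])
            worst = max(worst, max(abs(x - y) for rg, rw in zip(got, want)
                                   for x, y in zip(rg, rw)))
    return worst


if __name__ == "__main__":
    print(max_deviation(Q, Q, Q), max_deviation(Q, Q5, Q5), max_deviation(Q5, Q5, Q))
    print(max_deviation(QR, QR, QR), max_deviation(QL, QL, QL), max_deviation(QR, QL, None))
```

In this representation the R and L combinations are proportional to the projectors $(1 \pm \gamma_5)/2$, which is why they commute with each other.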

The decoupling effected by (18.27) has a simple interpretation. Referring 
to (18.22) and (18.23), we see that 


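$$\hat Q_{i,{\rm R}} = \int\hat q^\dagger\left(\frac{1+\gamma_5}{2}\right)\frac{\tau_i}{2}\hat q\,d^3x \tag{18.31}$$

and similarly for $\hat Q_{i,{\rm L}}$. But $((1 \pm \gamma_5)/2)$ are just the projection operators $P_{\rm R,L}$ 
introduced in section 12.3.2, which project out the chiral parts of any fermion 
field. Furthermore, it is easy to see that $P_{\rm R}^2 = P_{\rm R}$ and $P_{\rm L}^2 = P_{\rm L}$, so that $\hat Q_{i,{\rm R}}$ 
and $\hat Q_{i,{\rm L}}$ can also be written as 

$$\hat Q_{i,{\rm R}} = \int\hat q^\dagger_{\rm R}\frac{\tau_i}{2}\hat q_{\rm R}\,d^3x, \qquad \hat Q_{i,{\rm L}} = \int\hat q^\dagger_{\rm L}\frac{\tau_i}{2}\hat q_{\rm L}\,d^3x, \tag{18.32}$$

where $\hat q_{\rm R} = ((1+\gamma_5)/2)\hat q$, $\hat q_{\rm L} = ((1-\gamma_5)/2)\hat q$. In a similar way, the currents 
(18.20) and (18.21) can be written as 

$$\hat j^\mu_i = \hat j^\mu_{i,{\rm R}} + \hat j^\mu_{i,{\rm L}}, \qquad \hat j^\mu_{i,5} = \hat j^\mu_{i,{\rm R}} - \hat j^\mu_{i,{\rm L}} \tag{18.33}$$

where 

$$\hat j^\mu_{i,{\rm R}} = \bar{\hat q}_{\rm R}\gamma^\mu\frac{\tau_i}{2}\hat q_{\rm R}, \qquad \hat j^\mu_{i,{\rm L}} = \bar{\hat q}_{\rm L}\gamma^\mu\frac{\tau_i}{2}\hat q_{\rm L}. \tag{18.34}$$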


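Thus the SU(2)$_{\rm L}$ and SU(2)$_{\rm R}$ refer to the two chiral components of the fermion 
fields, which is why it is called chiral symmetry. 

Under infinitesimal SU(2) isospin and axial isospin transformations, $\hat q$ 
transforms by 

$$\hat q' = (1 - i\boldsymbol\epsilon\cdot\boldsymbol\tau/2 - i\boldsymbol\eta\cdot\boldsymbol\tau/2\,\gamma_5)\,\hat q. \tag{18.35}$$

This can be rewritten in terms of $\hat q_{\rm R}$ and $\hat q_{\rm L}$, using 

$$\hat q = \hat q_{\rm R} + \hat q_{\rm L}, \qquad \gamma_5\hat q_{\rm R} = \hat q_{\rm R}, \qquad \gamma_5\hat q_{\rm L} = -\hat q_{\rm L}. \tag{18.36}$$

We find that 

$$\hat q'_{\rm R} = (1 - i(\boldsymbol\epsilon + \boldsymbol\eta)\cdot\boldsymbol\tau/2)\,\hat q_{\rm R} \tag{18.37}$$

and similarly 

$$\hat q'_{\rm L} = (1 - i(\boldsymbol\epsilon - \boldsymbol\eta)\cdot\boldsymbol\tau/2)\,\hat q_{\rm L}. \tag{18.38}$$

Hence $\hat q_{\rm R}$ and $\hat q_{\rm L}$ transform quite independently$^2$, which is why $[\hat Q_{i,{\rm R}}, \hat Q_{j,{\rm L}}] = 0$. 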

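This formalism allows us to see immediately why (18.12) is chirally invariant: 
problem 18.3 verifies that $\hat{\mathcal L}_q$ can be written as 

$$\hat{\mathcal L}_q = \bar{\hat q}_{\rm R}\,i\,{\not\!\!\hat D}\,\hat q_{\rm R} + \bar{\hat q}_{\rm L}\,i\,{\not\!\!\hat D}\,\hat q_{\rm L} \tag{18.39}$$

which is plainly invariant under (18.37) and (18.38), since $\hat D$ is flavour-blind. 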

There is as yet no formal proof that this SU(2)$_{f{\rm L}}\times$SU(2)$_{f{\rm R}}$ chiral symmetry 
is spontaneously broken in QCD, though it can be argued that the 
larger symmetry SU(3)$_{f{\rm L}}\times$SU(3)$_{f{\rm R}}$, appropriate to three massless flavours, 
must be spontaneously broken (see Weinberg 1996, section 22.5). This is, of 
course, an issue that cannot be settled within perturbation theory (compare 
the comments after (17.132)). Numerical solutions of QCD on a lattice (see 
chapter 16) do provide strong evidence that baryons acquire large dynamical 
(SU(2)$_{f5}$-breaking) mass. 

Even granted that chiral symmetry is spontaneously broken in massless 
two-flavour QCD, how do we know that it breaks in such a way as to leave 
the isospin (‘R + L’) symmetry unbroken? A plausible answer can be given 
if we restore the quark mass terms via 


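$$\hat{\mathcal L}_m = -m_u\bar{\hat u}\hat u - m_d\bar{\hat d}\hat d = -\frac{1}{2}(m_u + m_d)\bar{\hat q}\hat q - \frac{1}{2}(m_u - m_d)\bar{\hat q}\tau_3\hat q. \tag{18.40}$$

Now 

$$\bar{\hat q}\hat q = \bar{\hat q}_{\rm L}\hat q_{\rm R} + \bar{\hat q}_{\rm R}\hat q_{\rm L} \tag{18.41}$$

and 

$$\bar{\hat q}\tau_3\hat q = \bar{\hat q}_{\rm L}\tau_3\hat q_{\rm R} + \bar{\hat q}_{\rm R}\tau_3\hat q_{\rm L}. \tag{18.42}$$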


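Including these extra terms is somewhat analogous to switching on an external 
field in the ferromagnetic problem, which determines a preferred direction for 
the symmetry breaking. It is clear that neither of (18.41) and (18.42) preserves 
SU(2)$_{f{\rm L}}\times$SU(2)$_{f{\rm R}}$, since they treat the L and R parts differently. Indeed from 
(18.37) and (18.38) we find 

$$\bar{\hat q}_{\rm L}\hat q_{\rm R} \to \bar{\hat q}'_{\rm L}\hat q'_{\rm R} = \bar{\hat q}_{\rm L}(1 + i(\boldsymbol\epsilon - \boldsymbol\eta)\cdot\boldsymbol\tau/2)(1 - i(\boldsymbol\epsilon + \boldsymbol\eta)\cdot\boldsymbol\tau/2)\hat q_{\rm R} \tag{18.43}$$

$$= \bar{\hat q}_{\rm L}\hat q_{\rm R} - i\boldsymbol\eta\cdot\bar{\hat q}_{\rm L}\boldsymbol\tau\hat q_{\rm R} \tag{18.44}$$

and 

$$\bar{\hat q}_{\rm R}\hat q_{\rm L} \to \bar{\hat q}_{\rm R}\hat q_{\rm L} + i\boldsymbol\eta\cdot\bar{\hat q}_{\rm R}\boldsymbol\tau\hat q_{\rm L}. \tag{18.45}$$

Equations (18.44) and (18.45) confirm that the term $\bar{\hat q}\hat q$ in (18.40) is invariant 
under the isospin part of SU(2)$_{f{\rm L}}\times$SU(2)$_{f{\rm R}}$ (since $\boldsymbol\epsilon$ is not involved), but not 
invariant under the axial isospin transformations parametrized by $\boldsymbol\eta$. The 
$\bar{\hat q}\tau_3\hat q$ term explicitly breaks the third component of isospin (resembling an 
electromagnetic effect), but its magnitude may be expected to be smaller 
than that of the $\bar{\hat q}\hat q$ term, being proportional to the difference of the masses, 
rather than their sum. This suggests that the vacuum will ‘align’ in such a 
way as to preserve isospin, but break axial isospin. 

$^2$ We may set $\boldsymbol\gamma = \boldsymbol\epsilon + \boldsymbol\eta$, and $\boldsymbol\delta = \boldsymbol\epsilon - \boldsymbol\eta$. 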




18.2 Pion decay and the Goldberger—Treiman relation 


We now discuss some of the rather surprising phenomenological implications of 
spontaneously broken chiral symmetry — specifically, the spontaneous break- 
ing of the axial isospin symmetry. We start by ignoring any ‘explicit’ quark 
masses, so that the axial isospin current is conserved, $\partial_\mu\hat j^\mu_{i,5} = 0$. From sections 
17.4 and 17.5 (suitably generalized) we know that this current has non-zero 
matrix elements between the vacuum and a ‘Goldstone’ state, which in our 
case is the pion. We therefore set (cf (17.94)) 

$$\langle 0|\hat j^\mu_{i,5}(x)|\pi_j, p\rangle = ip^\mu f_\pi e^{-ip\cdot x}\delta_{ij} \tag{18.46}$$

where $f_\pi$ is a constant with dimensions of mass, and which we expect to be 
related to a symmetry breaking vev. This is just what we shall find in section 
18.3.1. Note that (18.46) is consistent with $\partial_\mu\hat j^\mu_{i,5} = 0$ if $p^2 = 0$, i.e. if the 
pion is massless. 

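We treat $f_\pi$ as a phenomenological parameter. Its value can be determined 
from the rate for the decay $\pi^+ \to \mu^+\nu_\mu$ by the following reasoning. In chapter 
20 we shall learn that the effective weak Hamiltonian density for this low 
energy strangeness non-changing semileptonic transition is 

$$\hat{\mathcal H}_W(x) = \frac{G_{\rm F}}{\sqrt{2}}V_{ud}\,\bar{\hat\psi}_d(x)\gamma^\mu(1-\gamma_5)\hat\psi_u(x)\,[\bar{\hat\psi}_{\nu_e}(x)\gamma_\mu(1-\gamma_5)\hat\psi_e(x) + \bar{\hat\psi}_{\nu_\mu}(x)\gamma_\mu(1-\gamma_5)\hat\psi_\mu(x)] \tag{18.47}$$

where $G_{\rm F}$ is the Fermi constant and $V_{ud}$ is an element of the Cabibbo-Kobayashi-Maskawa 
(CKM) matrix (see section 20.7.3). Thus the lowest-order contribution 
to the S-matrix is 

$$\begin{aligned}
&-i\langle\mu^+, p_1; \nu_\mu, p_2|\int d^4x\,\hat{\mathcal H}_W(x)|\pi^+, p\rangle \\
&\quad = -i\frac{G_{\rm F}}{\sqrt{2}}V_{ud}\int d^4x\,\langle\mu^+, p_1; \nu_\mu, p_2|\bar{\hat\psi}_{\nu_\mu}(x)\gamma_\mu(1-\gamma_5)\hat\psi_\mu(x)|0\rangle \\
&\qquad\times\langle 0|\bar{\hat\psi}_d(x)\gamma^\mu(1-\gamma_5)\hat\psi_u(x)|\pi^+, p\rangle.
\end{aligned} \tag{18.48}$$

The leptonic matrix element gives $\bar u_\nu(p_2)\gamma_\mu(1-\gamma_5)v_\mu(p_1)e^{i(p_1+p_2)\cdot x}$. For the 
pionic one, we note that 

$$\bar{\hat\psi}_d(x)\gamma^\mu(1-\gamma_5)\hat\psi_u(x) = \hat j^\mu_1(x) - i\hat j^\mu_2(x) - \hat j^\mu_{1,5}(x) + i\hat j^\mu_{2,5}(x) \tag{18.49}$$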


from (18.20) and (18.21). Further, the currents $\hat j^\mu_i$ can have no matrix elements 
between the vacuum (which is a $0^+$ state) and the $\pi$ (which is $0^-$), by the 
following argument. From Lorentz invariance such a matrix element has to 
be a 4-vector. But since the initial and final parities are different, it would 
have to be an axial 4-vector$^3$. However, the only 4-vector available is the 
pion’s momentum $p^\mu$ which is an ordinary (not an axial) 4-vector. On the 
other hand, precisely for this reason the axial currents $\hat j^\mu_{i,5}$ do have a non-zero 
matrix element, as in (18.46). Noting that $|\pi^+\rangle = -\frac{1}{\sqrt{2}}|\pi_1 + i\pi_2\rangle$, we find that 

$$\langle 0|\bar{\hat\psi}_d(x)\gamma^\mu(1-\gamma_5)\hat\psi_u(x)|\pi^+, p\rangle = \frac{1}{\sqrt{2}}\langle 0|\hat j^\mu_{1,5}(x) - i\hat j^\mu_{2,5}(x)|\pi_1 + i\pi_2\rangle \tag{18.50}$$

$$= i\sqrt{2}\,p^\mu f_\pi e^{-ip\cdot x} \tag{18.51}$$

so that (18.48) becomes 

$$i(2\pi)^4\delta^4(p_1 + p_2 - p)\,[-iG_{\rm F}V_{ud}\,\bar u_\nu(p_2)\gamma_\mu(1-\gamma_5)v_\mu(p_1)\,p^\mu f_\pi]. \tag{18.52}$$

The quantity in brackets is, therefore, the invariant amplitude for the process, 
$\mathcal M$. Using $p = p_1 + p_2$, we may replace ${\not\!p}$ in (18.52) by $m_\mu$, neglecting the 
neutrino mass. 

Before proceeding, we comment on the physics of (18.52). The $(1-\gamma_5)$ 
factor acting on a $v$ spinor selects out the $\gamma_5 = -1$ eigenvalue which, if the 
muon was massless, would correspond to positive helicity for the $\mu^+$ (compare 
the discussion in section 12.4.2). Likewise, taking the $(1-\gamma_5)$ through the 
$\gamma^0\gamma^\mu$ factor to act on $u^\dagger_\nu$, it selects the negative helicity neutrino state. Hence 
the configuration is as shown in figure 18.2, so that the leptons carry off a 
net spin angular momentum. But this is forbidden, since the pion spin is 
zero. Hence the amplitude vanishes for massless muons and neutrinos. Now 
the muon, at least, is not massless, and some ‘wrong’ helicity is present in 
its wavefunction, in an amount proportional to $m_\mu$. This is why, as we have 
just remarked after (18.52), the amplitude is proportional to $m_\mu$. The rate is 
therefore proportional to $m_\mu^2$. This is a very important conclusion, because it 
implies that the rate to muons is $\sim (m_\mu/m_e)^2 \sim (200)^2$ times greater than 
the rate to electrons, a result which agrees with experiment, while grossly 
contradicting the naive expectation that the rate with the larger energy release 
should dominate. This, in fact, is one of the main indications for the 
‘vector-axial vector’, or ‘V-A’, structure of (18.47), as we shall see in more detail in 
section 20.2. 

FIGURE 18.2 
Helicities of massless leptons in $\pi^+ \to \mu^+\nu_\mu$ due to the ‘V-A’ interaction. 

$^3$ See chapter 4 of volume 1. 

Problem 18.4 shows that the rate computed from (18.52) is 


$$\Gamma_{\pi\to\mu\nu} = \frac{G_{\rm F}^2m_\mu^2f_\pi^2(m_\pi^2 - m_\mu^2)^2}{4\pi m_\pi^3}|V_{ud}|^2. \tag{18.53}$$

Including radiative corrections, the value 

$$f_\pi \simeq 92\ {\rm MeV} \tag{18.54}$$

can be extracted. 
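
As a rough numerical cross-check of (18.53) and of the helicity-suppression argument, the sketch below evaluates the tree-level width with $f_\pi = 92$ MeV and converts it to a lifetime. The remaining inputs ($G_{\rm F}$, the masses, $|V_{ud}|$ and $\hbar$) are approximate values supplied here as assumptions and do not come from the text; radiative corrections are ignored, and the function names are hypothetical.

```python
# Illustrative sketch (not from the text): tree-level pi -> l nu width from (18.53).
# All numerical inputs except f_pi = 92 MeV are approximate values assumed here.
import math

G_F = 1.1663788e-5       # Fermi constant, GeV^-2 (assumed input)
M_PI = 0.13957           # charged pion mass, GeV (assumed input)
M_MU = 0.1056584         # muon mass, GeV (assumed input)
M_E = 0.000510999        # electron mass, GeV (assumed input)
V_UD = 0.974             # CKM element magnitude (assumed input)
F_PI = 0.092             # pion decay constant, GeV, as quoted in (18.54)
HBAR = 6.582119569e-25   # GeV s


def leptonic_width(m_lepton, f_pi=F_PI):
    """Width in GeV for pi+ -> l+ nu_l according to (18.53)."""
    return (G_F ** 2 * m_lepton ** 2 * f_pi ** 2 * (M_PI ** 2 - m_lepton ** 2) ** 2
            / (4.0 * math.pi * M_PI ** 3)) * V_UD ** 2


def lifetime_seconds(width_gev):
    return HBAR / width_gev


def electron_to_muon_ratio():
    """Ratio of the electron to the muon width; f_pi and V_ud cancel."""
    return leptonic_width(M_E) / leptonic_width(M_MU)


if __name__ == "__main__":
    print(lifetime_seconds(leptonic_width(M_MU)))
    print(electron_to_muon_ratio())
```

With these inputs the lifetime comes out at a few times $10^{-8}$ s, close to the measured charged-pion lifetime of about $2.6\times10^{-8}$ s, and the electron mode is suppressed to roughly $10^{-4}$ of the muon mode: the factor $(m_e/m_\mu)^2$ is only partly offset by the larger phase space.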

Consider now another matrix element of $\hat j^\mu_{i,5}$, this time between nucleon 
states. Following an analysis similar to that in section 8.8 for the matrix 
elements of the electromagnetic current operator between nucleon states, we 
write 

$$\langle N, p'|\hat j^\mu_{i,5}(0)|N, p\rangle = \bar u(p')\left[\gamma^\mu\gamma_5F^5_1(q^2) + \frac{i\sigma^{\mu\nu}}{2M}q_\nu\gamma_5F^5_2(q^2) + q^\mu\gamma_5F^5_3(q^2)\right]\frac{\tau_i}{2}u(p), \tag{18.55}$$

where the $F^5_i$'s are certain form factors, $M$ is the nucleon mass, and $q = p - p'$. 
The spinors in (18.55) are understood to be written in flavour and Dirac space. 
Since (with massless quarks) $\hat j^\mu_{i,5}$ is conserved, that is $q_\mu\hat j^\mu_{i,5}(0) = 0$, we find 


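$$\begin{aligned}
0 &= \bar u(p')[{\not\!q}\gamma_5F^5_1(q^2) + q^2\gamma_5F^5_3(q^2)]\frac{\tau_i}{2}u(p) \\
&= \bar u(p')[({\not\!p} - {\not\!p}')\gamma_5F^5_1(q^2) + q^2\gamma_5F^5_3(q^2)]\frac{\tau_i}{2}u(p) \\
&= \bar u(p')[-2M\gamma_5F^5_1(q^2) + q^2\gamma_5F^5_3(q^2)]\frac{\tau_i}{2}u(p),
\end{aligned} \tag{18.56}$$

using ${\not\!p}\gamma_5 = -\gamma_5{\not\!p}$ and the Dirac equations for $u(p)$, $\bar u(p')$. Hence the form 
factors $F^5_1$ and $F^5_3$ must satisfy 

$$2MF^5_1(q^2) = q^2F^5_3(q^2). \tag{18.57}$$

FIGURE 18.3 
One pion intermediate state contribution to $F^5_3$. 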


Now the matrix element (18.55) enters into neutron $\beta$-decay (as does the 
matrix element of $\hat j^\mu_i(0)$). Here, $q^2 \approx 0$ and (18.57) appears to predict, therefore, 
that either $M = 0$ (which is certainly not so) or $F^5_1(0) = 0$. But $F^5_1(0)$ 
can be measured in $\beta$ decay, and is found to be approximately equal to 1.26; it 
is conventionally called $g_A$. The only possible conclusion is that $F^5_3$ must contain 
a part proportional to $1/q^2$. Such a contribution can only arise from the 
propagator of a massless particle which, of course, is the pion. This elegant 
physical argument, first given by Nambu (1960), sheds a revealing new light 
on the phenomenon of spontaneous symmetry breaking: the existence of the 
massless particle coupled to the symmetry current $\hat j^\mu_{i,5}$ ‘saves’ the conservation 
of the current. 

We calculate the pion contribution to $F_3^5$ as follows. The process is
pictured in figure 18.3. The pion-current matrix element is given by (18.46),
and the (massless) propagator is $i/q^2$. For the $\pi$-$N$ vertex, the conventional
Lagrangian is

$$i g_{\pi NN} \hat\pi_i \bar{\hat N} \gamma_5 \tau_i \hat N, \tag{18.58}$$

which is SU(2)$_f$-invariant and parity conserving since the pion field is a
pseudoscalar, and so is $\bar{\hat N} \gamma_5 \hat N$. Putting these pieces together, the contribution of
figure 18.3 to the current matrix element is

$$2 g_{\pi NN} \bar u(p') \gamma_5 \frac{\tau_i}{2} u(p) \, \frac{i}{q^2} \, (-i q^\mu f_\pi), \tag{18.59}$$

and so

$$F_3^5(q^2) = \frac{1}{q^2} 2 g_{\pi NN} f_\pi \tag{18.60}$$

from this contribution. Combining (18.57) with (18.60) we deduce

$$g_A = \lim_{q^2 \to 0} F_1^5(q^2) = \frac{g_{\pi NN} f_\pi}{M}, \tag{18.61}$$
the Goldberger-Treiman (1958) relation. Taking $M = 939$ MeV, $g_A = 1.26$
and $f_\pi = 92$ MeV one obtains $g_{\pi NN} \approx 12.9$, which is only 5% below the
experimental value of this effective pion-nucleon coupling constant.

We can repeat the argument leading to the G-T relation but retaining
$m_\pi^2 \neq 0$. Equation (18.46) tells us that $\partial_\mu \hat j^\mu_{i,5} / (m_\pi^2 f_\pi)$ behaves like a properly
normalized pion field, at least when operating on a near mass-shell pion state.
This means that the one-nucleon matrix element of $\partial_\mu \hat j^\mu_{i,5}$ is (cf (18.59))

$$2 g_{\pi NN} \bar u(p') \gamma_5 \frac{\tau_i}{2} u(p) \, \frac{i}{q^2 - m_\pi^2} \, m_\pi^2 f_\pi, \tag{18.62}$$

while from (18.55) it is given by

$$i \bar u(p') \left[ -2M \gamma_5 F_1^5(q^2) + q^2 \gamma_5 F_3^5(q^2) \right] \frac{\tau_i}{2} u(p). \tag{18.63}$$

Hence

$$-2M F_1^5(q^2) + q^2 F_3^5(q^2) = \frac{2 g_{\pi NN} m_\pi^2 f_\pi}{q^2 - m_\pi^2}. \tag{18.64}$$

Also, in place of (18.60) we now have

$$F_3^5(q^2) = \frac{1}{q^2 - m_\pi^2} 2 g_{\pi NN} f_\pi. \tag{18.65}$$

Equations (18.64) and (18.65) are consistent for $q^2 = m_\pi^2$ if

$$F_1^5(q^2 = m_\pi^2) = g_{\pi NN} f_\pi / M. \tag{18.66}$$

$F_1^5(q^2)$ varies only slowly from $q^2 = 0$ to $q^2 = m_\pi^2$, since it contains no rapidly
varying pion pole contribution, and so we recover the G-T relation again.
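
The arithmetic behind the G-T estimate, and the way the pion-pole form (18.65) combines with (18.64), can be checked with a few lines of Python. This is only a sketch: the function names are illustrative, and the pion mass value is an assumption added here. Solving (18.64) for $F_1^5$ with $F_3^5$ taken from the pole term gives $g_{\pi NN} f_\pi / M$ at every sample $q^2$, which is the content of (18.66).

```python
M_N = 939.0    # nucleon mass in MeV, as quoted in the text
G_A = 1.26     # axial coupling from beta decay
F_PI = 92.0    # pion decay constant in MeV
M_PI = 139.57  # charged pion mass in MeV (assumed value)


def g_pinn_from_gt(g_a=G_A, m_n=M_N, f_pi=F_PI):
    """Goldberger-Treiman relation (18.61) solved for g_piNN."""
    return g_a * m_n / f_pi


def f1_from_pole(q2, g_pinn, m_pi=M_PI, m_n=M_N, f_pi=F_PI):
    """Solve (18.64) for F_1^5(q^2), with F_3^5 given by the pole term (18.65)."""
    f3 = 2.0 * g_pinn * f_pi / (q2 - m_pi**2)
    rhs = 2.0 * g_pinn * m_pi**2 * f_pi / (q2 - m_pi**2)
    return (q2 * f3 - rhs) / (2.0 * m_n)


g_pinn = g_pinn_from_gt()
f1_values = [f1_from_pole(q2, g_pinn) for q2 in (0.0, 0.5 * M_PI**2, 2.0 * M_PI**2)]

print(f"g_piNN from G-T = {g_pinn:.2f}")
print("F_1^5 at sample q^2 values:", f1_values)
```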

Amplitudes involving two Goldstone pions can be calculated by an exten- 
sion of these techniques. However, a much more efficient method is available, 
through the use of effective Lagrangians, which capture the low energy dy- 
namics of the Goldstone modes. 


18.3 Effective Lagrangians 

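18.3.1 The linear and non-linear σ-models

We begin by considering the linear σ-model, which has the same Lagrangian
as the one considered in section 17.6,

$$\hat{\mathcal L}_\Phi = (\partial_\mu \hat\phi^\dagger)(\partial^\mu \hat\phi) + \mu^2 \hat\phi^\dagger \hat\phi - \frac{\lambda}{4} (\hat\phi^\dagger \hat\phi)^2, \tag{18.67}$$

but we shall interpret it differently here. The sign of the $\mu^2$ term has been
chosen to induce spontaneous symmetry breaking. In section 17.6, $\hat\phi$ was the
SU(2) doublet

$$\hat\phi = \begin{pmatrix} (\hat\phi_1 + i\hat\phi_2)/\sqrt{2} \\ (\hat\phi_3 + i\hat\phi_4)/\sqrt{2} \end{pmatrix}, \tag{18.68}$$

in terms of which (18.67) becomes

$$\hat{\mathcal L}_\Phi = \frac{1}{2} \partial_\mu \hat\phi_a \partial^\mu \hat\phi_a + \frac{1}{2} \mu^2 \hat\phi_a \hat\phi_a - \frac{\lambda}{16} (\hat\phi_a \hat\phi_a)^2, \tag{18.69}$$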

where the sum on $a = 1$ to 4 is understood. Evidently (18.69) is invariant
under transformations which preserve the 'dot product' $\hat\phi_a \hat\phi_a$; namely the
transformations of SO(4). This group is discussed in appendix M, section
M.4.3. We note there that the algebra of the generators of SO(4) is the same
as that of SU(2) × SU(2), which is the algebra of the chiral charges in (18.28)-
(18.30). This suggests that we should rewrite (18.69) in such a way as to reveal
its SU(2)$_L$×SU(2)$_R$ symmetry, rather than its O(4) symmetry. Three of the
four fields will then be identified with the Goldstone bosons associated with
the spontaneous breaking of the 'R - L' part; they will in turn be identified
with the (massless) pions.
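One way to bring out the chiral symmetry of (18.69) is to write

$$\hat\phi = \begin{pmatrix} (\hat\pi_2 + i\hat\pi_1)/\sqrt{2} \\ (\hat\sigma - i\hat\pi_3)/\sqrt{2} \end{pmatrix} = \frac{1}{\sqrt{2}} \hat\Sigma \begin{pmatrix} 0 \\ 1 \end{pmatrix}, \tag{18.70}$$

where

$$\hat\Sigma = \hat\sigma + i \boldsymbol\tau \cdot \hat{\boldsymbol\pi}. \tag{18.71}$$

Then

$$\hat\phi_a \hat\phi_a = \hat\sigma^2 + \hat{\boldsymbol\pi}^2 = \frac{1}{2} \mathrm{Tr}(\hat\Sigma^\dagger \hat\Sigma), \tag{18.72}$$

and (18.69) becomes

$$\hat{\mathcal L}_\Sigma = \frac{1}{4} \mathrm{Tr}(\partial_\mu \hat\Sigma \, \partial^\mu \hat\Sigma^\dagger) + \frac{\mu^2}{4} \mathrm{Tr}(\hat\Sigma^\dagger \hat\Sigma) - \frac{\lambda}{64} [\mathrm{Tr}(\hat\Sigma^\dagger \hat\Sigma)]^2. \tag{18.73}$$

This Lagrangian is invariant under the SU(2)$_L$×SU(2)$_R$ transformation

$$\hat\Sigma \to U_L \hat\Sigma U_R^\dagger \tag{18.74}$$

where

$$U_L = \exp(-i \boldsymbol\alpha_L \cdot \boldsymbol\tau/2), \quad U_R = \exp(-i \boldsymbol\alpha_R \cdot \boldsymbol\tau/2) \tag{18.75}$$

are two independent SU(2) transformations (remember that $\mathrm{Tr}AB = \mathrm{Tr}BA$).
For the case of infinitesimal transformations, we find (problem 18.5)

$$\hat\sigma \to \hat\sigma - \boldsymbol\eta \cdot \hat{\boldsymbol\pi} \tag{18.76}$$
$$\hat{\boldsymbol\pi} \to \hat{\boldsymbol\pi} + \boldsymbol\eta \hat\sigma + \boldsymbol\epsilon \times \hat{\boldsymbol\pi}, \tag{18.77}$$

where

$$\boldsymbol\eta = (\boldsymbol\epsilon_R - \boldsymbol\epsilon_L)/2, \quad \boldsymbol\epsilon = (\boldsymbol\epsilon_R + \boldsymbol\epsilon_L)/2, \tag{18.78}$$

and $\boldsymbol\epsilon_L$, $\boldsymbol\epsilon_R$ are the infinitesimal values of $\boldsymbol\alpha_L$, $\boldsymbol\alpha_R$.
Evidently $\boldsymbol\epsilon_R = \boldsymbol\eta + \boldsymbol\epsilon$ and $\boldsymbol\epsilon_L = \boldsymbol\epsilon - \boldsymbol\eta$, which we may compare with the L and
R transformation of the quark fields in (18.37), (18.38).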
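
The infinitesimal rules (18.76) and (18.77) can be checked numerically by applying first-order versions of $U_L$ and $U_R$ to a sample $\Sigma$ matrix. The sketch below uses plain nested lists for the 2×2 matrices; all helper names and the sample field values are illustrative assumptions, and the fields are treated as ordinary numbers rather than operators. The residual difference is of second order in the small parameters.

```python
# Pauli matrices and the 2x2 identity as nested lists (pure Python, no NumPy).
IDENTITY = [[1, 0], [0, 1]]
TAU = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def dagger(a):
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def combine(terms):
    """Sum of coefficient * matrix over (coefficient, matrix) pairs."""
    return [[sum(c * m[i][j] for c, m in terms) for j in range(2)]
            for i in range(2)]


def trace(a):
    return a[0][0] + a[1][1]


def sigma_field(sigma, pi):
    """Sigma = sigma + i tau.pi, eq. (18.71)."""
    return combine([(sigma, IDENTITY)] + [(1j * pi[k], TAU[k]) for k in range(3)])


def first_order_su2(params):
    """exp(-i params.tau/2) truncated at first order, cf. eq. (18.75)."""
    return combine([(1, IDENTITY)] + [(-0.5j * params[k], TAU[k]) for k in range(3)])


def read_components(mat):
    """Recover (sigma, pi) from a matrix of the form sigma + i tau.pi."""
    sigma = (trace(mat) / 2).real
    pi = [(trace(matmul(TAU[k], mat)) / 2j).real for k in range(3)]
    return sigma, pi


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


# Sample field values and small transformation parameters (arbitrary choices).
sigma0, pi0 = 0.7, [0.3, -0.2, 0.5]
eps_L = [1e-6, 2e-6, -1e-6]
eps_R = [-2e-6, 1e-6, 3e-6]

# Sigma -> U_L Sigma U_R^dagger, eq. (18.74)
transformed = matmul(matmul(first_order_su2(eps_L), sigma_field(sigma0, pi0)),
                     dagger(first_order_su2(eps_R)))
sigma_new, pi_new = read_components(transformed)

# Prediction from (18.76)-(18.78)
eta = [(r - l) / 2 for l, r in zip(eps_L, eps_R)]
eps = [(r + l) / 2 for l, r in zip(eps_L, eps_R)]
sigma_pred = sigma0 - sum(e * p for e, p in zip(eta, pi0))
rotated = cross(eps, pi0)
pi_pred = [pi0[k] + eta[k] * sigma0 + rotated[k] for k in range(3)]

max_error = max([abs(sigma_new - sigma_pred)]
                + [abs(a - b) for a, b in zip(pi_new, pi_pred)])
print("largest deviation from (18.76)-(18.77):", max_error)
```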
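With the sign of $\mu^2$ as in (18.73), the classical potential has a minimum at

$$\hat\sigma^2 + \hat{\boldsymbol\pi}^2 = 4\mu^2/\lambda \equiv v^2, \tag{18.79}$$

which we interpret as the symmetry breaking condition

$$\langle 0 | \hat\sigma^2 + \hat{\boldsymbol\pi}^2 | 0 \rangle = v^2. \tag{18.80}$$

Let us choose the particular ground state

$$\langle 0 | \hat\sigma | 0 \rangle = v, \quad \langle 0 | \hat{\boldsymbol\pi} | 0 \rangle = 0, \tag{18.81}$$

which is actually the same as (17.103). Referring back to (18.76) and (18.77)
we see that this vacuum is invariant under 'L + R' transformations with
parameters $\boldsymbol\epsilon$, but not under 'L - R' transformations with parameters $\boldsymbol\eta$. These
correspond respectively to the SU(2)$_f$ flavour isospin, and SU(2)$_{f5}$ axial flavour
isospin, transformations on the quark fields. So this vacuum spontaneously
breaks the axial isospin symmetry. Fluctuations away from this minimum are
described by fields $\hat{\boldsymbol\pi}$ and $\hat s = \hat\sigma - v$. Placing this shift into (18.73) we find
that $\hat{\mathcal L}_\Sigma$ becomes $\hat{\mathcal L}_s$, where

$$\hat{\mathcal L}_s = \frac{1}{2} \partial_\mu \hat s \, \partial^\mu \hat s - \mu^2 \hat s^2 + \frac{1}{2} \partial_\mu \hat{\boldsymbol\pi} \cdot \partial^\mu \hat{\boldsymbol\pi} - \frac{\lambda}{4} v \hat s (\hat s^2 + \hat{\boldsymbol\pi}^2) - \frac{\lambda}{16} (\hat s^2 + \hat{\boldsymbol\pi}^2)^2, \tag{18.82}$$

discarding an irrelevant constant. As expected, the field $\hat s$ is massive (with
mass $\sqrt{2}\mu$), while the fields $\hat{\boldsymbol\pi}$ are massless, and may be identified with the
Goldstone modes associated with the spontaneous breaking of the axial isospin
symmetry.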

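The Lagrangian $\hat{\mathcal L}_s$ incorporates the correct symmetries, and can be used
to calculate $\pi$-$\pi$ scattering, for example (in the massless limit). But it is not
the most efficient Lagrangian to use, as we can see from the following
considerations. Consider the amplitude for $\pi^+$-$\pi^0$ scattering, in tree approximation
(Donoghue et al. 1992). The contributing terms in $\hat{\mathcal L}_s$ are

$$-\frac{\lambda}{16} (\hat{\boldsymbol\pi}^2)^2 - \frac{\lambda}{4} v \hat s \hat{\boldsymbol\pi}^2, \tag{18.83}$$

which we can rewrite in terms of the charged and neutral fields as

$$-\frac{\lambda}{16} (2\hat\pi^+ \hat\pi^- + \hat\pi^0 \hat\pi^0)^2 - \frac{\lambda}{4} v \hat s (2\hat\pi^+ \hat\pi^- + \hat\pi^0 \hat\pi^0). \tag{18.84}$$

Then the terms responsible for $\pi^+$-$\pi^0$ scattering at tree level are

$$-\frac{\lambda}{4} \hat\pi^+ \hat\pi^- \hat\pi^0 \hat\pi^0 - \frac{\lambda}{4} v \hat s (2\hat\pi^+ \hat\pi^- + \hat\pi^0 \hat\pi^0). \tag{18.85}$$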
The first of these represents a four-pion contact interaction with amplitude

$$-i\lambda/2, \tag{18.86}$$

while the second contributes an $\hat s$-exchange graph in the t-channel with
amplitude

$$(-i\lambda v/2)^2 \, \frac{i}{q^2 - 2\mu^2}, \tag{18.87}$$

where $q$ is the 4-momentum transfer $q = p'_+ - p_+ = p_0 - p'_0$. The sum of these
is

$$-\frac{i\lambda}{2} \, \frac{q^2}{q^2 - 2\mu^2}, \tag{18.88}$$

which reduces to $iq^2/v^2$ for $q^2 \approx 0$. Thus, despite the apparent constant
4-boson piece (18.86), the total amplitude in fact vanishes as $q^2 \to 0$, due to a
cancellation.
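
The cancellation can be seen numerically. The sketch below adds the contact term (18.86) to the exchange term (18.87), using $v^2 = 4\mu^2/\lambda$ from (18.79); the sample values of $\lambda$, $\mu^2$ and $q^2$ are arbitrary choices made here for illustration, and the function names are not from the text.

```python
def contact_amplitude(lam):
    """Four-pion contact term, eq. (18.86)."""
    return -0.5j * lam


def exchange_amplitude(lam, mu2, q2):
    """s-exchange term, eq. (18.87), with v^2 = 4 mu^2 / lambda from (18.79)."""
    v = (4.0 * mu2 / lam) ** 0.5
    vertex = -0.5j * lam * v
    return vertex * vertex * 1j / (q2 - 2.0 * mu2)


def total_amplitude(lam, mu2, q2):
    return contact_amplitude(lam) + exchange_amplitude(lam, mu2, q2)


# Arbitrary sample values (in units where mu^2 = 1)
LAM, MU2, Q2 = 0.8, 1.0, 1.0e-3
V2 = 4.0 * MU2 / LAM

total = total_amplitude(LAM, MU2, Q2)
closed_form = -0.5j * LAM * Q2 / (Q2 - 2.0 * MU2)   # eq. (18.88)
low_energy = 1j * Q2 / V2                           # i q^2 / v^2

print("contact     :", contact_amplitude(LAM))
print("exchange    :", exchange_amplitude(LAM, MU2, Q2))
print("sum         :", total)
print("i q^2 / v^2 :", low_energy)
```

The two individual pieces are of order $\lambda$, while their sum is of order $q^2/v^2$.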

This cancellation is not an accident. It is generally true that Goldstone
fields enter only via their derivatives, which bring factors of momenta into the
amplitudes. We drew attention to this following equation (17.85), and the
same is true of the $\hat\theta$ fields in (17.107). This suggests that it is both possible,
and more efficient, to recast $\hat{\mathcal L}_s$ into a form in which only the derivatives of
the Goldstone fields enter. Equation (17.107) indicates how to do this: we
define new pion fields (but call them the same) by

$$\hat\Sigma = (v + \hat S) \hat U, \quad \hat U = \exp(i \boldsymbol\tau \cdot \hat{\boldsymbol\pi}/v), \tag{18.89}$$

where $\hat S$ is invariant under SU(2)$_L$×SU(2)$_R$, and where $\hat U$ transforms by

$$\hat U \to U_L \hat U U_R^\dagger. \tag{18.90}$$

Now $\hat\Sigma^\dagger \hat\Sigma = (v + \hat S)^2$, and the Goldstone modes have been transformed away
from the potential terms in $\hat{\mathcal L}_\Sigma$, reappearing in the derivative terms instead.
We write the transformed Lagrangian as $\hat{\mathcal L}_S$ where


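$$\hat{\mathcal L}_S = \frac{1}{2} \partial_\mu \hat S \, \partial^\mu \hat S - \mu^2 \hat S^2 + \frac{1}{4} (v + \hat S)^2 \mathrm{Tr}(\partial_\mu \hat U \, \partial^\mu \hat U^\dagger) - \frac{\lambda}{4} v \hat S^3 - \frac{\lambda}{16} \hat S^4, \tag{18.91}$$

where we have used

$$\hat U^\dagger \partial_\mu \hat U + (\partial_\mu \hat U^\dagger) \hat U = 0, \tag{18.92}$$

which follows from the unitary condition $\hat U^\dagger \hat U = 1$.
When $\partial_\mu \hat U$ is expanded in powers of $\hat{\boldsymbol\pi}$, we recover a kinetic energy piece

$$\frac{1}{2} \partial_\mu \hat{\boldsymbol\pi} \cdot \partial^\mu \hat{\boldsymbol\pi}, \tag{18.93}$$

and all other terms involve derivatives of $\hat{\boldsymbol\pi}$. In particular, the term with
the lowest number of derivatives which contributes to the $\pi$-$\pi$ scattering
amplitude is

$$\frac{1}{6v^2} \left[ (\hat{\boldsymbol\pi} \cdot \partial_\mu \hat{\boldsymbol\pi})(\hat{\boldsymbol\pi} \cdot \partial^\mu \hat{\boldsymbol\pi}) - \hat{\boldsymbol\pi}^2 \, \partial_\mu \hat{\boldsymbol\pi} \cdot \partial^\mu \hat{\boldsymbol\pi} \right], \tag{18.94}$$

since the $\hat S$-$\hat\pi$-$\hat\pi$ vertex already has two derivatives. The reader may
check that the amplitude for $\pi^+ \pi^0 \to \pi^+ \pi^0$ calculated from (18.94) is $iq^2/v^2$,
exactly as before, but this time without having to go through the cancellation
argument.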
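
The origin of the kinetic term (18.93) can be made explicit by expanding the exponential to first order in the pion field. The short derivation below is a sketch that uses only $\mathrm{Tr}(\tau_i \tau_j) = 2\delta_{ij}$ and sets $\hat S = 0$ in the coefficient $\frac{1}{4}(v + \hat S)^2$ of (18.91):

```latex
\begin{aligned}
\hat U &= 1 + \frac{i}{v}\,\boldsymbol\tau\cdot\hat{\boldsymbol\pi} + O(\hat\pi^2), \\
\partial_\mu \hat U &= \frac{i}{v}\,\boldsymbol\tau\cdot\partial_\mu\hat{\boldsymbol\pi} + O(\hat\pi\,\partial\hat\pi), \\
\frac{v^2}{4}\,\mathrm{Tr}\left(\partial_\mu \hat U\,\partial^\mu \hat U^\dagger\right)
  &= \frac{1}{4}\,\mathrm{Tr}(\tau_i\tau_j)\,\partial_\mu\hat\pi_i\,\partial^\mu\hat\pi_j + \dots
   = \frac{1}{2}\,\partial_\mu\hat{\boldsymbol\pi}\cdot\partial^\mu\hat{\boldsymbol\pi} + \dots
\end{aligned}
```

Every term omitted here carries two derivatives and additional powers of $\hat{\boldsymbol\pi}/v$, which is why the pion interactions vanish at zero momentum.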

The fields in $\hat\Sigma$ on the one hand, and in $\hat S$ and $\hat U$ on the other, are related
non-linearly, but a physical amplitude calculated with either representation
has turned out to be the same, in this simple case. It is in fact generally
true that such non-linear field redefinitions lead to the same physics (Haag
1958, Coleman, Wess and Zumino 1969, Callan, Coleman, Wess and Zumino
1969). It is clearly advantageous to work with $\hat{\mathcal L}_S$, which builds in the desired
derivatives of the Goldstone modes.

Indeed, we can simplify matters even further. Since $\hat S$ is invariant under
SU(2)$_L$×SU(2)$_R$, the full symmetry of the Lagrangian is maintained with only
the field $\hat U$, transforming by (18.90), and we may as well discard $\hat S$ altogether.
The dynamics of the Goldstone sector are then described by the non-linear
σ-model, with Lagrangian

$$\hat{\mathcal L}_2 = \frac{v^2}{4} \mathrm{Tr}(\partial_\mu \hat U \, \partial^\mu \hat U^\dagger). \tag{18.95}$$

This is the most general Lagrangian that involves the Goldstone fields, exhibits
the desired symmetry, and contains only two derivatives.

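Since $\hat{\mathcal L}_2$ is invariant under the SU(2)$_L$×SU(2)$_R$ transformations (18.75),
we can calculate the associated Noether currents (problem 18.6), obtaining

$$\hat j^\mu_{i,L} = \frac{iv^2}{8} \mathrm{Tr}[\tau_i (\partial^\mu \hat U) \hat U^\dagger - \tau_i \hat U \partial^\mu \hat U^\dagger] = -\frac{iv^2}{4} \mathrm{Tr}(\tau_i \hat U \partial^\mu \hat U^\dagger), \tag{18.96}$$

$$\hat j^\mu_{i,R} = \frac{iv^2}{8} \mathrm{Tr}[\tau_i (\partial^\mu \hat U^\dagger) \hat U - \tau_i \hat U^\dagger \partial^\mu \hat U] = -\frac{iv^2}{4} \mathrm{Tr}(\tau_i \hat U^\dagger \partial^\mu \hat U). \tag{18.97}$$

The axial 'R - L' current is then

$$\hat j^\mu_{i,5} = \frac{iv^2}{4} \mathrm{Tr}[\tau_i (\hat U \partial^\mu \hat U^\dagger - \hat U^\dagger \partial^\mu \hat U)], \tag{18.98}$$

and the vector 'R + L' current is

$$\hat j^\mu_i = -\frac{iv^2}{4} \mathrm{Tr}[\tau_i (\hat U \partial^\mu \hat U^\dagger + \hat U^\dagger \partial^\mu \hat U)]. \tag{18.99}$$

Expanding (18.98) in powers of the pion field, we find

$$\hat j^\mu_{i,5} = v \partial^\mu \hat\pi_i + \dots \tag{18.100}$$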
which we may compare with (17.93). Just as in equation (17.94), (18.100)
implies that this axial current has a matrix element between the vacuum and
the one-Goldstone state:

$$\langle 0 | \hat j^\mu_{i,5}(x) | \pi_j, p \rangle = -i p^\mu v \, e^{-ip \cdot x} \delta_{ij}. \tag{18.101}$$

Now comes the pay-off: this is the same symmetry current which enters into
weak interactions, for which we already defined the vacuum-to-one-particle
matrix element in terms of the pion decay constant $f_\pi$, via equation (18.46).
Comparing (18.101) with (18.46) we identify

$$v = f_\pi. \tag{18.102}$$

Thus, finally, the dynamics of our massless pions, to lowest order in an
expansion in powers of momenta, is given by the Lagrangian

$$\hat{\mathcal L}_2 = \frac{f_\pi^2}{4} \mathrm{Tr}(\partial_\mu \hat U \, \partial^\mu \hat U^\dagger). \tag{18.103}$$

It is quite remarkable that the low energy dynamics of the (massless)
Goldstone modes is completely determined in terms of one constant, measurable
in $\pi$ decay.

The Lagrangian of (18.103) is an example of an effective Lagrangian. By
this is meant, broadly, any Lagrangian which involves the presumed relevant
degrees of freedom (here the Goldstone modes), and respects desired
symmetries of the theory. Evidently it is implied that there is some 'underlying
theory', couched in terms of different degrees of freedom (here quarks and
gluons), from which the symmetries have been abstracted. It is important
to realize that an effective Lagrangian may or may not be renormalizable.
Whereas our starting Lagrangian $\hat{\mathcal L}_\Phi$ is renormalizable, $\hat{\mathcal L}_2$ is not: clearly the
latter contains terms with arbitrarily many pion fields, which are operators
of arbitrarily high dimension, compensated by negative powers of $f_\pi$. As it
stands, $\hat{\mathcal L}_2$ can only be used at tree level, as, for example, in the calculation
of $\pi$-$\pi$ scattering using the interaction (18.94), for which the amplitude has
an energy dependence of the form $E^2/f_\pi^2$, where $E$ is the order of magnitude
of the particles' energy or momentum. This interaction has mass dimension 6,
and its coupling $1/f_\pi^2$ has dimension (mass)$^{-2}$, like the 4-fermion interaction
considered in section 11.8. It is therefore not renormalizable. However, the
argument of section 11.8 suggests that a loop-by-loop renormalization
programme is possible, and this was shown to be the case by Weinberg (1979a).
Each loop built from the interaction (18.94) will carry an extra two powers
of energy, to compensate the $1/f_\pi^2$ in the coupling. Thus $f_\pi$ (or perhaps this
multiplied by factors like 4 and $\pi$, if we are lucky) provides the energy scale
characteristic of a non-renormalizable theory: as we go up in energy, we need
more loops. But, at each loop order new divergences appear, which require
additional counter terms for renormalization. Thus at any given order in
$E^2/f_\pi^2$, we must ensure that our effective Lagrangian contains all the
appropriate counter terms which are allowed by the symmetry. For example, at
one-loop order for $\hat{\mathcal L}_2$, we need to include the 4-derivative terms

$$\hat{\mathcal L}_4 = c_1 \mathrm{Tr}(\partial_\mu \hat U \partial^\mu \hat U^\dagger \partial_\nu \hat U \partial^\nu \hat U^\dagger) + c_2 \mathrm{Tr}(\partial_\mu \hat U \partial_\nu \hat U^\dagger \partial^\mu \hat U \partial^\nu \hat U^\dagger). \tag{18.104}$$

To perform a one-loop calculation, one uses $\hat{\mathcal L}_2$ at tree-level and in one-loop
diagrams, and $\hat{\mathcal L}_4$ at tree-level only.

Real pions, however, are not massless, nor are real quarks. We need to 
extend our effective Lagrangian to include explicit chiral symmetry breaking 
mass terms. 


18.3.2 Inclusion of explicit symmetry breaking: masses for 
pions and quarks 


Consider the term

$$\hat{\mathcal L}_{m_\pi} = \frac{m_\pi^2 f_\pi^2}{4} \mathrm{Tr}(\hat U + \hat U^\dagger). \tag{18.105}$$

This is invariant only under the restricted set of transformations with
$\boldsymbol\alpha_L = \boldsymbol\alpha_R$, that is transformations such that $U_R = U_L$, for then
$\mathrm{Tr}\hat U \to \mathrm{Tr}(U_L \hat U U_L^\dagger) = \mathrm{Tr}\hat U$. Such transformations form the SU(2) flavour
isospin group. The term (18.105) breaks the axial isospin group explicitly,
which would correspond to transformations with $\boldsymbol\alpha_L = -\boldsymbol\alpha_R$, or equivalently
$U_L = U_R^\dagger$, under which $\hat U \to U_L \hat U U_L$. Expanding (18.105) to second order in
the pion fields, we find the term

$$\hat{\mathcal L}_{\mathrm{quad}, m_\pi} = -\frac{1}{2} m_\pi^2 \hat{\boldsymbol\pi}^2, \tag{18.106}$$

which, together with (18.93), shows that the pion field now has mass $m_\pi$.
Higher-order terms can be added, $m_\pi^2$ counting as equivalent to two
derivatives. The low energy expansion is now an expansion in both the energy $E$
and the pion mass $m_\pi$. This is called chiral perturbation theory, or ChPT for
short.

For example, to calculate $\pi$-$\pi$ scattering to order $E^2$, we use $\hat{\mathcal L}_2 + \hat{\mathcal L}_{m_\pi}$
at tree-level, expanded up to fourth power in the pion fields. The result is
to change the amplitude for $\pi^+ \pi^0 \to \pi^+ \pi^0$ from $i(p'_+ - p_+)^2/f_\pi^2$ to
$i[(p'_+ - p_+)^2 - m_\pi^2]/f_\pi^2$. By considering the general $\pi$-$\pi$ amplitude, predictions
can be made for low energy observables, for example
the s-wave scattering lengths $a_0$ and $a_2$ in the isospin 0 and 2 channels. The
results (first calculated by Weinberg 1966 using current algebra techniques)
are

$$a_0 = \frac{7 m_\pi}{32\pi f_\pi^2} \approx 0.16 \, m_\pi^{-1}, \quad a_2 = -\frac{m_\pi}{16\pi f_\pi^2} \approx -0.05 \, m_\pi^{-1}. \tag{18.107}$$

The experimental values are $a_0 = 0.26 \pm 0.05 \, m_\pi^{-1}$ and $a_2 = -0.028 \pm
0.012 \, m_\pi^{-1}$, as given by Donoghue et al. (1992). The next order in ChPT
improves upon these results.
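
The numbers in (18.107) follow directly from $m_\pi$ and $f_\pi$. The sketch below evaluates the tree-level scattering lengths in units of $m_\pi^{-1}$ and compares them with the quoted experimental values; the pion mass used is an assumed value, and the function name is illustrative only.

```python
import math

M_PI = 139.57   # charged pion mass in MeV (assumed value)
F_PI = 92.0     # pion decay constant in MeV


def scattering_lengths(m_pi=M_PI, f_pi=F_PI):
    """Tree-level s-wave scattering lengths of (18.107), in units of 1/m_pi."""
    x = (m_pi / f_pi) ** 2
    a0 = 7.0 * x / (32.0 * math.pi)
    a2 = -x / (16.0 * math.pi)
    return a0, a2


a0, a2 = scattering_lengths()

# Experimental values and errors quoted in the text (Donoghue et al. 1992)
A0_EXP, A0_ERR = 0.26, 0.05
A2_EXP, A2_ERR = -0.028, 0.012

pull_a0 = (A0_EXP - a0) / A0_ERR
pull_a2 = (A2_EXP - a2) / A2_ERR

print(f"a0 = {a0:.3f} / m_pi  (experiment differs by {pull_a0:.1f} sigma)")
print(f"a2 = {a2:.3f} / m_pi  (experiment differs by {pull_a2:.1f} sigma)")
```

Both lowest-order predictions lie within about two standard deviations of the quoted measurements.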
A systematic exposition of ChPT at the one-loop level was given by Gasser
and Leutwyler (1984). Bijnens et al. (1996) carried the $\pi$-$\pi$ calculation to
two-loop order; see also Colangelo et al. (2001).

It is clear that there must be some relation between the masses of the u
and d quarks (in the SU(2) flavour case) and the pion mass, since the latter
must vanish in the limit $m_u, m_d \to 0$. To see this connection, we consider
the quark mass term in the 2-flavour QCD Lagrangian, which is

$$-\bar{\hat q} \, m_2 \, \hat q, \quad m_2 = \begin{pmatrix} m_u & 0 \\ 0 & m_d \end{pmatrix}. \tag{18.108}$$

Let us now redefine the quark fields (compare (17.107) and (18.17)) by

$$\hat q = \exp[-i \boldsymbol\tau \cdot \hat{\boldsymbol\pi} \gamma_5 / (2 f_\pi)] \hat f. \tag{18.109}$$

This transformation is a perfectly good parametrization of the Goldstone fields
associated with the axial symmetry (18.17), and effectively removes them from
the new fermion fields $\hat f$. The quark mass term now becomes

$$-\bar{\hat f} \exp[-i \boldsymbol\tau \cdot \hat{\boldsymbol\pi} \gamma_5 / (2 f_\pi)] \, m_2 \, \exp[-i \boldsymbol\tau \cdot \hat{\boldsymbol\pi} \gamma_5 / (2 f_\pi)] \hat f. \tag{18.110}$$

We now make the assumption that the axial SU(2) is spontaneously broken
in QCD, by imposing a non-zero vev on the symmetry-breaking operator $\bar{\hat f} \hat f$:

$$\langle 0 | \bar{\hat f}_i \hat f_j | 0 \rangle = -f_\pi^2 B \delta_{ij} \quad (i, j = 1, 2). \tag{18.111}$$

Expanding (18.110) up to second order in the pion fields, retaining just the
expectation value of the fermion bilinear*, we find a mass term

$$-\frac{1}{2} B (m_u + m_d) \hat{\boldsymbol\pi}^2, \tag{18.112}$$

from which the relation (Gasser and Leutwyler 1982)

$$m_\pi^2 = -\frac{(m_u + m_d)}{f_\pi^2} \langle 0 | \bar{\hat f} \hat f | 0 \rangle \tag{18.113}$$

follows, where $\bar{\hat f} \hat f$ represents either $\bar{\hat f}_u \hat f_u$ or $\bar{\hat f}_d \hat f_d$. From (18.113) we can see
that the square of the pion mass is proportional to the average u-d quark mass
(provided of course that $B$ does not accidentally vanish), and goes to zero as
they do; $\langle 0 | \bar{\hat f} \hat f | 0 \rangle$ is the 'chiral condensate' (cf figure 18.1).

Lattice QCD (see chapter 16) can be used to test equation (18.113), since
simulations can be done for a range of quark masses, and the relation between
$m_\pi^2$ and $m_{u,d}$ can be checked. Conversely, ChPT can assist lattice QCD
calculations by guiding the extrapolation of the calculated results to quark
mass values lighter than can presently be simulated. For example, Noaki et al.
(2008) have reported the results of such a calculation, using 2 light dynamical
quark flavours, in the overlap fermion formalism (Neuberger 1998a, 1998b),
which preserves chiral symmetry at finite lattice spacing. Their pion masses
ranged from 290 MeV to 750 MeV, and they compared their results with the
predictions of ChPT at one-loop (Gasser and Leutwyler 1984) and two-loop
(Colangelo et al. 2001). They found good fits to the ChPT formulae, and
extracted quark masses (in the $\overline{\mathrm{MS}}$ scheme at the scale 2 GeV) of about 4.5
MeV; they also found $|\langle 0 | \bar{\hat f} \hat f | 0 \rangle| \approx (235 \ \mathrm{MeV})^3$, in the $\overline{\mathrm{MS}}$ scheme at 2 GeV
scale. Studies by this and other groups are continuing, with 3 light flavours,
lighter pion masses, and other lattice fermion formalisms.

*A formal justification of this step is provided by Weinberg (1996), section 19.6.
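
As an order-of-magnitude check, the lattice numbers quoted above can be inserted into the lowest-order relation (18.113). The sketch below assumes that the quoted 4.5 MeV is the average u-d quark mass, so that $m_u + m_d \approx 9$ MeV, and uses $f_\pi = 92$ MeV; the function name is illustrative only. The result is within roughly 15% of the physical pion mass, which is about the accuracy one can expect at this order.

```python
import math

M_Q_AVG = 4.5              # average light-quark mass in MeV (assumed reading of the text)
CONDENSATE_SCALE = 235.0   # |<0| f-bar f |0>|^(1/3) in MeV
F_PI = 92.0                # pion decay constant in MeV


def pion_mass_lowest_order(m_q_avg, condensate_scale, f_pi):
    """Pion mass from (18.113), with m_u + m_d = 2 * m_q_avg."""
    m_pi_squared = 2.0 * m_q_avg * condensate_scale**3 / f_pi**2
    return math.sqrt(m_pi_squared)


m_pi_lo = pion_mass_lowest_order(M_Q_AVG, CONDENSATE_SCALE, F_PI)
print(f"lowest-order pion mass = {m_pi_lo:.1f} MeV")
```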


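18.3.3 Extension to SU(3)$_{fL}$×SU(3)$_{fR}$


To the extent that the strange quark is also 'light' on hadronic scales, the
QCD Lagrangian has the larger symmetry of SU(3)$_{fL}$×SU(3)$_{fR}$, which breaks
spontaneously so as to preserve the flavour symmetry SU(3)$_f$, and produce
an SU(3) octet of pseudoscalar Goldstone bosons: $\pi^\pm$, $\pi^0$, $K^\pm$, $K^0$, $\bar K^0$ and $\eta_8$
(see figure 12.4). The effective Lagrangian approach to the dynamics of the
Goldstone fields can be easily extended to chiral SU(3). One simply replaces
$\hat U = \exp(i \boldsymbol\tau \cdot \hat{\boldsymbol\pi}/f_\pi)$ by $\hat V = \exp(i \boldsymbol\lambda \cdot \hat{\boldsymbol\phi}/f_\pi)$ where

$$\frac{1}{\sqrt{2}} \boldsymbol\lambda \cdot \hat{\boldsymbol\phi} = \begin{pmatrix} \frac{1}{\sqrt{2}} \hat\pi^0 + \frac{1}{\sqrt{6}} \hat\eta_8 & \hat\pi^+ & \hat K^+ \\ \hat\pi^- & -\frac{1}{\sqrt{2}} \hat\pi^0 + \frac{1}{\sqrt{6}} \hat\eta_8 & \hat K^0 \\ \hat K^- & \hat{\bar K}^0 & -\frac{2}{\sqrt{6}} \hat\eta_8 \end{pmatrix}. \tag{18.114}$$

One easily verifies that the kinetic terms in

$$\hat{\mathcal L}_2 = \frac{f_\pi^2}{4} \mathrm{Tr}(\partial_\mu \hat V \, \partial^\mu \hat V^\dagger) \tag{18.115}$$

have the correct normalization, using $\mathrm{Tr}\lambda_a \lambda_b = 2\delta_{ab}$. The 3-flavour quark
mass term is now

$$-\bar{\hat f} \exp[-i \boldsymbol\lambda \cdot \hat{\boldsymbol\phi} \gamma_5 / (2 f_\pi)] \, m_3 \, \exp[-i \boldsymbol\lambda \cdot \hat{\boldsymbol\phi} \gamma_5 / (2 f_\pi)] \hat f, \tag{18.116}$$

where

$$m_3 = \begin{pmatrix} m_u & 0 & 0 \\ 0 & m_d & 0 \\ 0 & 0 & m_s \end{pmatrix}. \tag{18.117}$$

The axial SU(3) symmetry breaking vev is

$$\langle 0 | \bar{\hat f}_i \hat f_j | 0 \rangle = -f_\pi^2 B \delta_{ij} \quad (i, j = 1, 2, 3) \tag{18.118}$$

and the meson mass term is

$$-\frac{1}{2} B \, \mathrm{Tr}[(\boldsymbol\lambda \cdot \hat{\boldsymbol\phi})^2 m_3]. \tag{18.119}$$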
This yields (problem 18.7)

$$m_{\pi^0}^2 = m_{\pi^\pm}^2 = B(m_u + m_d), \tag{18.120}$$
$$m_{K^\pm}^2 = B(m_u + m_s), \tag{18.121}$$
$$m_{K^0}^2 = B(m_d + m_s), \tag{18.122}$$
$$m_{\eta_8}^2 = \frac{1}{3} B(m_u + m_d + 4m_s), \tag{18.123}$$

and there is also a term which mixes $\pi^0$ and $\eta_8$:

$$-\frac{B}{\sqrt{3}} (m_u - m_d) \hat\pi^0 \hat\eta_8. \tag{18.124}$$

It is interesting that the charged and neutral pions have the same mass, even
though we have made no assumption about the ratio of $m_u$ to $m_d$. The
observed pion mass differences arise from electromagnetism.

If we ignore for the moment electromagnetic mass differences, we can
deduce from (18.120)-(18.122) the relation

$$\frac{m_\pi^2}{2m_K^2 - m_\pi^2} = \frac{m_u + m_d}{2 m_s}. \tag{18.125}$$

The left-hand side is approximately equal to 0.04, so we learn that the
non-strange quarks are on average about 1/25 times as heavy as the strange quark. We also
obtain

$$m_{\eta_8}^2 = \frac{1}{3} (4m_K^2 - m_\pi^2), \tag{18.126}$$

which is the Gell-Mann-Okubo formula for the (squared) masses of the
pseudoscalar meson octet (Gell-Mann 1961, Okubo 1962). Using average values
for the $K$ and $\pi$ masses, the relation (18.126) predicts $m_{\eta_8} = 566$ MeV, quite
close to the $\eta$ (548 MeV).
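
The two numerical statements above are easy to reproduce. The sketch below evaluates the left-hand side of (18.125) and the Gell-Mann-Okubo prediction (18.126), using isospin-averaged meson masses that are assumed here (the text does not state which averages it uses); the function names are illustrative.

```python
import math

M_K_AVG = 495.6    # average of charged and neutral kaon masses, MeV (assumed)
M_PI_AVG = 138.0   # average pion mass, MeV (assumed)


def light_to_strange_ratio(m_pi, m_k):
    """Left-hand side of (18.125), equal to (m_u + m_d) / (2 m_s) at lowest order."""
    return m_pi**2 / (2.0 * m_k**2 - m_pi**2)


def eta8_mass(m_pi, m_k):
    """Gell-Mann-Okubo prediction (18.126) for the eta_8 mass."""
    return math.sqrt((4.0 * m_k**2 - m_pi**2) / 3.0)


ratio = light_to_strange_ratio(M_PI_AVG, M_K_AVG)
m_eta8 = eta8_mass(M_PI_AVG, M_K_AVG)

print(f"(m_u + m_d) / (2 m_s) = {ratio:.4f}  (about 1/{1.0 / ratio:.0f})")
print(f"m_eta8 = {m_eta8:.1f} MeV")
```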

Further progress requires the inclusion of electromagnetic effects, since $m_u$
and $m_d$ are themselves comparable to the electromagnetic mass differences.
Including these effects, Weinberg (1996) estimates

$$\frac{m_d}{m_s} \approx 0.050, \quad \frac{m_u}{m_s} \approx 0.027; \tag{18.127}$$

see also Leutwyler (1996). Note that the d quark is almost twice as heavy as
the u quark: according to QCD, the origin of SU(2) isospin symmetry is not
that $m_u \approx m_d$, but that both are very small compared with, say, $\Lambda_{\overline{\mathrm{MS}}}$.

All the results we have given are subject to correction by the inclusion 
of higher-order effects in the ChPT expansion. In the case of chiral SU(3), 
the fourth-order Lagrangian $\hat{\mathcal L}_4$ contains 8 terms (Gasser and Leutwyler 1984,
1985). Donoghue et al. (1992) give a clear exposition of ChPT to one-loop 
order. 
18.4 Chiral anomalies


In all our discussions of symmetries so far — unbroken, approximate, and 
spontaneously broken — there is one result on which we have relied, and never 
queried. We refer to Noether’s theorem, as discussed in section 12.3.1. This 
states that for every continuous symmetry of a Lagrangian, there is a cor- 
responding conserved current. We demonstrated this result in some special 
cases, but we have now to point out that while it is undoubtedly valid at 
the level of the classical Lagrangian and field equations, we did not inves- 
tigate whether quantum corrections might violate the classical conservation 
law. This can, in fact, happen and when it does the afflicted current (or its 
divergence) is said to be ‘anomalous’, or to contain an ‘anomaly’. General 
analysis shows that anomalies occur in renormalizable theories of fermions 
coupled to both vector and axial vector currents. We may therefore expect to 
find anomalies among the vector and axial vector flavour currents which we 
have been discussing. 

One way of understanding how anomalies arise is through consideration of 
the renormalization process, which is in general necessary once we get beyond 
the classical (‘tree level’) approximation. As we saw in volume 1, this will 
invariably entail some regularization of divergent integrals. But the specific 
example of the $O(e^2)$ photon self-energy studied in section 11.3 showed that
a simple cut-off form of regularization already violated the current conserva- 
tion (or gauge invariance) condition (11.21). In that case, it was possible to 
find alternative regularizations which respected electromagnetic current con- 
servation, and were satisfactory. Anomalies arise when both axial and vector 
symmetry currents are present, since it is not possible to find a regulariza- 
tion scheme which preserves both vector and axial vector current conservation 
(Adler 1970, Jackiw 1972, Adler and Bardeen 1969). 

We shall not attempt an extended discussion of this technical subject. But 
we do want to alert the reader to the existence of these anomalies; to indicate 
how they arise in one simple model; and to explain why, in some cases, they 
are to be welcomed, while in others they must be eliminated. 

We consider the classic case of $\pi^0 \to \gamma\gamma$, in the context of spontaneously
broken chiral flavour symmetry, with massless quarks and pions. The axial
isospin current $\hat j^\mu_{3,5}(x)$ should then be conserved, but we shall see that this
implies that the amplitude for $\pi^0 \to \gamma\gamma$ must vanish, as first pointed out
by Veltman (1967) and Sutherland (1967). We begin by writing the matrix
element of $\hat j^\mu_{3,5}(x)$ between the vacuum and a $2\gamma$ state, in momentum space,
as

$$\int d^4x \, e^{-iq \cdot x} \langle \gamma, k_1, \epsilon_1; \gamma, k_2, \epsilon_2 | \hat j^\mu_{3,5}(x) | 0 \rangle
= (2\pi)^4 \delta^4(k_1 + k_2 - q) \, \epsilon^*_{1\nu}(k_1) \epsilon^*_{2\lambda}(k_2) \mathcal{M}^{\mu\nu\lambda}(k_1, k_2). \tag{18.128}$$
FIGURE 18.4
The amplitude considered in (18.128), and the one pion intermediate state
contribution to it.


As in figure 18.3, one contribution to $\mathcal{M}^{\mu\nu\lambda}$ has the form (constant/$q^2$) due to
the massless $\pi^0$ propagator, shown in figure 18.4. This is, once again, because
when chiral symmetry is spontaneously broken, the axial current connects the
pion state to the vacuum, as described by the matrix element (18.46). The
contribution of the process shown in figure 18.4 to $\mathcal{M}^{\mu\nu\lambda}$ is then

$$i q^\mu f_\pi \, \frac{i}{q^2} \, iA \, \epsilon^{\nu\lambda\alpha\beta} k_{1\alpha} k_{2\beta}, \tag{18.129}$$

where the $\pi^0 \to \gamma\gamma$ amplitude is $A \epsilon^{\nu\lambda\alpha\beta} \epsilon^*_{1\nu}(k_1) \epsilon^*_{2\lambda}(k_2) k_{1\alpha} k_{2\beta}$. Note that this
automatically incorporates electromagnetic gauge invariance (the amplitude
vanishes when the polarization vector of either photon is replaced by its
4-momentum, due to the antisymmetry of the $\epsilon$ symbol), and it is symmetrical
under interchange of the photon labels. Now consider replacing $\hat j^\mu_{3,5}(x)$ in
(18.128) by $\partial_\mu \hat j^\mu_{3,5}(x)$, which should be zero. A partial integration then shows
that this implies that

$$q_\mu \mathcal{M}^{\mu\nu\lambda} = 0, \tag{18.130}$$

which with (18.129) implies that $A = 0$, and hence that $\pi^0 \to \gamma\gamma$ is forbidden.
It is important to realize that all other contributions to $\mathcal{M}^{\mu\nu\lambda}$, apart from
the $\pi^0$ one shown in figure 18.4, will not have the $1/q^2$ factor in (18.129), and
will therefore give a vanishing contribution to $q_\mu \mathcal{M}^{\mu\nu\lambda}$ at $q^2 = 0$ which is the
on-shell point for the (massless) pion.

It is of course true that $m_\pi^2 \neq 0$. But estimates (Adler 1969) of the
consequent corrections suggest that the predicted rate of $\pi^0 \to \gamma\gamma$ for real $\pi^0$'s is
far too small. Consequently, there is a problem for the hypothesis of
spontaneously broken (approximate) chiral symmetry.

FIGURE 18.5
The two $O(\alpha)$ graphs contributing to $\pi^0 \to \gamma\gamma$ decay.


In such a situation it is helpful to consider a detailed calculation performed
within a specific model. This is supplied by Itzykson and Zuber (1980),
section 11.5.2; in essentials it is the same as the one originally considered by
Steinberger (1949) in the first calculation of the $\pi^0 \to \gamma\gamma$ rate, and
subsequently by Bell and Jackiw (1969) and by Adler (1969). It employs (scalar)
$\sigma$ and (pseudoscalar) $\pi$ meson fields, augmented by a fermion of mass $m$
and charge $+e$, representing the proton. To order $\alpha$, there are two graphs to
consider, shown in figure 18.5(a) and (b). It turns out that the fermion loop
integral is actually convergent. In the limit $q^2 \to 0$ the result is

$$A = \frac{e^2}{4\pi^2 f_\pi}, \tag{18.131}$$

where $A$ is the $\pi^0 \to \gamma\gamma$ amplitude introduced above. Problem 18.8 evaluates
the $\pi^0 \to \gamma\gamma$ rate using (18.131), to give

$$\Gamma(\pi^0 \to 2\gamma) = \frac{\alpha^2 m_\pi^3}{64\pi^3 f_\pi^2}. \tag{18.132}$$

(18.132) is in very good agreement with experiment.
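
To put a number on this agreement, the sketch below evaluates the width that follows from the amplitude (18.131), assuming the standard relation $\Gamma = A^2 m_\pi^3/(64\pi)$ between amplitude and width, so that $\Gamma = \alpha^2 m_\pi^3/(64\pi^3 f_\pi^2)$. The neutral pion mass, $\alpha$ and $\hbar$ are assumed input values, and the function names are illustrative. The result is close to 8 eV, corresponding to a lifetime of order $10^{-16}$ s, in line with what Problem 18.8 asks for.

```python
import math

ALPHA = 1.0 / 137.036   # fine structure constant
M_PI0 = 134.98          # neutral pion mass in MeV (assumed value)
F_PI = 92.0             # pion decay constant in MeV
HBAR = 6.582e-22        # MeV s


def pi0_amplitude(f_pi=F_PI):
    """A = e^2 / (4 pi^2 f_pi) = alpha / (pi f_pi), eq. (18.131), in MeV^-1."""
    return ALPHA / (math.pi * f_pi)


def pi0_width_mev(m_pi=M_PI0, f_pi=F_PI):
    """Width from the assumed relation Gamma = A^2 m_pi^3 / (64 pi)."""
    return pi0_amplitude(f_pi) ** 2 * m_pi**3 / (64.0 * math.pi)


width_ev = pi0_width_mev() * 1.0e6
closed_form_ev = ALPHA**2 * M_PI0**3 / (64.0 * math.pi**3 * F_PI**2) * 1.0e6
lifetime_s = HBAR / pi0_width_mev()

print(f"Gamma(pi0 -> 2 gamma) = {width_ev:.2f} eV")
print(f"pi0 lifetime          = {lifetime_s:.2e} s")
```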

In principle, various possibilities now exist. But a careful analysis of the
'triangle' graph contributions to the matrix element $\mathcal{M}^{\mu\nu\lambda}$ of (18.128), shown
in figure 18.6, reveals that the fault lies in assuming that a regularization exists
such that for these amplitudes the conservation equation $q_\mu \langle \gamma\gamma | \hat j^\mu_{3,5}(0) | 0 \rangle = 0$
can be maintained, at the same time as electromagnetic gauge invariance. In
fact, no such regularization can be found. When the amplitudes of figure
18.6 are calculated using an (electromagnetic) gauge invariant procedure, one
finds a non-zero result for $q_\mu \langle \gamma\gamma | \hat j^\mu_{3,5}(0) | 0 \rangle$ (again the details are given in
Itzykson and Zuber (1980)). This implies that $\partial_\mu \hat j^\mu_{3,5}(x)$ is not zero after all, the
calculation producing the specific value

$$\partial_\mu \hat j^\mu_{3,5}(x) = -\frac{e^2}{32\pi^2} \epsilon^{\alpha\nu\beta\lambda} \hat F_{\alpha\nu} \hat F_{\beta\lambda}, \tag{18.133}$$

where the $\hat F$'s are the usual electromagnetic field strengths.

FIGURE 18.6
$O(\alpha)$ contributions to the matrix element in (18.128).

Equation (18.133) means that (18.130) is no longer valid, so that $A$ need
no longer vanish: indeed, (18.133) predicts a definite value for $A$, so we need to
see if it is consistent with (18.131). Taking the vacuum $\to 2\gamma$ matrix element
of (18.133) produces (problem 18.9)

$$i q_\mu \mathcal{M}^{\mu\nu\lambda} = \frac{e^2}{4\pi^2} \epsilon^{\alpha\nu\beta\lambda} k_{1\alpha} k_{2\beta}, \tag{18.134}$$

which is indeed consistent with (18.128) and (18.131), after suitably
interchanging the labels on the $\epsilon$ symbol.

Equation (18.133) is therefore a typical example of 'an anomaly': the
violation, at the quantum level, of a symmetry of the classical Lagrangian. It
might be thought that the result (18.133) is only valid to order $\alpha$ (though the
$O(\alpha^2)$ correction would presumably be very small). But Adler and Bardeen
(1969) showed that such 'triangle' loops give the only anomalous contributions
to the $\hat j^\mu_{i,5}$-$\gamma$-$\gamma$ vertex, so that (18.133) is true to all orders in $\alpha$.

The triangles considered above actually used a fermion with integer charge
(the proton). We clearly should use quarks, which carry fractional charge.
In this case, the previous numerical value for $A$ is multiplied by the factor
$\tau_3 Q^2$ for each contributing quark. For the u and d quarks of chiral SU(2) ×
SU(2), this gives 1/3. Consequently agreement with experiment is lost unless
there exist three replicas of each quark, identical in their electromagnetic and
SU(2) × SU(2) properties. Colour supplies just this degeneracy, and thus the
$\pi^0 \to \gamma\gamma$ rate is important evidence for such a degree of freedom, as we noted
in chapter 14.

In the foregoing discussion, the axial isospin current was associated with a
global symmetry; only the electromagnetic currents (in the case of $\pi^0 \to \gamma\gamma$)
were associated with a local (gauged) symmetry, and they remained conserved 
(anomaly free). If, however, we have an anomaly in a current associated with 
a local symmetry, we will have a serious problem. The whole rather elaborate 
construction of a quantum gauge field theory relies on current conservation 
equations such as (11.21) or (13.130) to eliminate unwanted gauge degrees 
of freedom, and ensure unitarity of the S-matrix. So anomalies in currents 
coupled to gauge fields cannot be tolerated. As we shall see in chapter 20,
and is already evident from (18.48), axial currents are indeed present in weak
interactions and they are coupled to the $W^\pm$, $Z^0$ gauge fields. Hence, if this
theory is to be satisfactory at the quantum level, all anomalies must somehow 
cancel away. That this is possible rests essentially on the observation that the 
anomaly (18.133) is independent of the mass of the circulating fermion. Thus 
cancellations are in principle possible between quark and lepton ‘triangles’ 
in the weak interaction case. Bouchiat et al. (1972) were the first to point 
out that, for each generation of quarks and leptons, the anomalies will cancel 
between quarks and leptons if the fractionally charged quarks come in three 
colours. The condition that anomalies cancel in the gauged currents of the 
Standard Model is the remarkably simple one (Ryder 1996, p384): 


$$N_c (Q_u + Q_d) + Q_e = 0 \tag{18.135}$$


where $N_c$ is the number of colours and $Q_u$, $Q_d$ and $Q_e$ are the charges (in units
of $e$) of the 'u', 'd', and 'e' type fields in each generation. Clearly (18.135)
is true for each generation of the Standard Model; the condition indicates 
a remarkable connection, at some deep level, between the facts that quarks 
come in three colours and have charges which are 1/3 fractions. The Standard 
Model provides no explanation for this connection. Anomaly cancellation is a 
powerful constraint on possible theories ('t Hooft 1980, Weinberg 1996 section 
22.4). 
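
Both pieces of charge arithmetic used in this section, the factor multiplying $A$ when the proton loop is replaced by quark loops and the cancellation condition (18.135), can be verified with exact fractions. The sketch below is illustrative only; the function names are not from the text.

```python
from fractions import Fraction

Q_U = Fraction(2, 3)    # charge of 'u'-type quarks, in units of e
Q_D = Fraction(-1, 3)   # charge of 'd'-type quarks
Q_E = Fraction(-1)      # charge of 'e'-type leptons


def pi0_amplitude_factor(n_colours):
    """Sum of tau_3 Q^2 over u (tau_3 = +1) and d (tau_3 = -1), times colours."""
    return n_colours * (Q_U**2 - Q_D**2)


def anomaly_sum(n_colours):
    """Left-hand side of the cancellation condition (18.135)."""
    return n_colours * (Q_U + Q_D) + Q_E


for n_c in (1, 2, 3):
    print(f"N_c = {n_c}: amplitude factor = {pi0_amplitude_factor(n_c)}, "
          f"anomaly sum = {anomaly_sum(n_c)}")
```

Only for three colours is the proton-loop value of $A$ recovered and the gauge anomaly sum equal to zero.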


Problems 

18.1 Verify (18.24)-(18.26). 

18.2 Verify (18.28)- (18.30). 

18.3 Show that £4 of (18.12) can be written as (18.39). 


18.4 Show that the rate for $\pi^+ \to \mu^+ \nu_\mu$, calculated from the lowest-order
matrix element (18.52), is given by (18.53).


18.5 Verify the transformation equations (18.76) and (18.77). 
18.6

(a) Consider a Lagrangian $\hat{\mathcal{L}}(\hat\phi_r, \partial_\mu\hat\phi_r)$ where the $\hat\phi_r$ could be either bosonic
or fermionic fields. Let the fields transform by an infinitesimal local
(x-dependent) transformation

$$\hat\phi_r \to \hat\phi_r - i\epsilon_\alpha(x)T^\alpha_{rs}\hat\phi_s \quad \text{(sum on } s\text{)}.$$

Show that the change in $\hat{\mathcal{L}}$ may be written as

$$\delta\hat{\mathcal{L}} = \hat{j}^{\alpha\mu}(x)\,\partial_\mu\epsilon_\alpha(x) + \epsilon_\alpha(x)\,\partial_\mu\hat{j}^{\alpha\mu}(x)$$

where

$$\hat{j}^{\alpha\mu}(x) = -iT^\alpha_{rs}\hat\phi_s\frac{\partial\hat{\mathcal{L}}}{\partial(\partial_\mu\hat\phi_r)} = \frac{\partial(\delta\hat{\mathcal{L}})}{\partial(\partial_\mu\epsilon_\alpha(x))}$$

and

$$\partial_\mu\hat{j}^{\alpha\mu}(x) = \frac{\partial(\delta\hat{\mathcal{L}})}{\partial\epsilon_\alpha(x)}. \tag{1}$$

Deduce that if $\hat{\mathcal{L}}$ is invariant under the global form of this transformation
(i.e. constant $\epsilon_\alpha$), then the current defined by (1) is conserved.
[This procedure for finding conserved currents for global symmetries
is due to Gell-Mann and Levy (1960).]

(b) Apply the method of part (a) to verify the form of the currents (18.96)
and (18.97).


18.7 Verify equations (18.120)-(18.124). 
18.8 Verify (18.132), and calculate the $\pi^0$ lifetime in seconds.
18.9 Verify (18.134). 
Spontaneously Broken Local Symmetry


In earlier parts of this book we have briefly indicated why we might want to 
search for a gauge theory of the weak interactions. The reasons include: (i) 
the goal of unification (e.g. with the U(1) gauge theory QED), as mentioned in 
section 1.3.5; and (ii) certain ‘universality’ phenomena (to be discussed more 
fully in chapter 20), which are reminiscent of a similar situation in QED (see 
comment (ii) in section 2.6, and also section 11.6), and which are particularly 
characteristic of a non-Abelian gauge theory, as pointed out in section 13.1 
after equation (13.44). However, we also know from section 1.3.5 that weak 
interactions are short-ranged, so that their mediating quanta must be massive. 
At first sight, this seems to rule out the possibility of a gauge theory of weak 
interactions, since a simple gauge boson mass violates gauge invariance, as we 
pointed out for the photon in section 11.3 and for non-Abelian gauge quanta 
in section 13.3.1, and will review again in the following section. Nevertheless, 
there is a way of giving gauge field quanta a mass, which is by ‘spontaneously 
breaking’ the gauge (i.e. local) symmetry. This is the topic of the present 
chapter. The detailed application to the electroweak theory will be made in 
chapter 21. 




19.1 Massive and massless vector particles 


Let us begin by noting an elementary (classical) argument for why a gauge 
field quantum cannot have mass. The electromagnetic potential satisfies the
Maxwell equation (cf (2.22))

$$\Box A^\nu - \partial^\nu(\partial_\mu A^\mu) = j^\nu_{\rm em} \tag{19.1}$$

which, as discussed in section 2.3, is invariant under the gauge transformation

$$A^\mu \to A'^\mu = A^\mu - \partial^\mu\chi. \tag{19.2}$$

However, if $A^\mu$ were to represent a massive field, the relevant wave equation
would be

$$(\Box + M^2)A^\nu - \partial^\nu(\partial_\mu A^\mu) = j^\nu_{\rm em}. \tag{19.3}$$

To get this, we have simply replaced the massless ‘Klein-Gordon’ operator $\Box$
by the corresponding massive one, $\Box + M^2$ (compare sections 3.1 and 5.3).

FIGURE 19.1
Fermion-fermion scattering via exchange of two X bosons.


Equation (19.3) is manifestly not invariant under (19.2), and it is precisely
the mass term $M^2 A^\nu$ that breaks the gauge invariance. The same conclusion
follows in a Lagrangian treatment; to obtain (19.3) as the corresponding
Euler-Lagrange equation, one adds a mass term $+\frac{1}{2}M^2 A_\mu A^\mu$ to the Lagrangian of
(7.66) (see also sections 11.4 and 13.3.1), and this clearly violates invariance
under (19.2). Similar reasoning holds for the non-Abelian case too. Perhaps,
then, we must settle for a theory involving massive charged vector bosons,
$W^\pm$ for example, without it being a gauge theory.

Such a theory is certainly possible, but it will not be renormalizable, as 
we now discuss. Consider figure 19.1, which shows some kind of fermion- 
fermion scattering (we need not be more specific), proceeding in fourth order of 
perturbation theory via the exchange of two massive vector bosons, which we 
will call X-particles. To calculate this amplitude, we need the propagator for 
the X-particle, which can be found by following the ‘heuristic’ route outlined in 
section 7.3.2 for photons. We consider the momentum-space version of (19.3)
for the corresponding $X^\nu$ field, but without the current on the right-hand side
(so as to describe a free field):

$$[(-k^2 + M^2)g^{\nu\mu} + k^\nu k^\mu]\,\tilde{X}_\mu(k) = 0, \tag{19.4}$$

which should be compared with (7.90). Apart from the ‘$i\epsilon$’, the propagator
should be proportional to the inverse of the quantity in the square brackets
in (19.4). Problem 19.1 shows that unlike the (massless) photon case, this
inverse does exist, and is given by

$$\frac{-g^{\mu\nu} + k^\mu k^\nu/M^2}{k^2 - M^2}. \tag{19.5}$$

A proper field-theoretic derivation would yield this result multiplied by an
overall factor ‘i’ as usual, and would also include the ‘$i\epsilon$’ via
$k^2 - M^2 \to k^2 - M^2 + i\epsilon$. We remark immediately that (19.5) gives nonsense in the limit
$M \to 0$, thus indicating already that a massless vector particle seems to be a
very different kind of thing from a massive one (we can’t just take the massless
limit of the latter).
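
The statement that (19.5) inverts the square bracket of (19.4) can be checked numerically. The sketch below is only an illustration under stated assumptions: the metric is taken as diag(+1, -1, -1, -1), the momentum and mass values are arbitrary, and the helper names (`wave_operator`, `candidate_inverse`, and so on) are ours.

```python
# Metric g = diag(+1, -1, -1, -1); numerically g^{mu nu} and g_{mu nu} coincide.
G = [[1.0, 0.0, 0.0, 0.0],
     [0.0, -1.0, 0.0, 0.0],
     [0.0, 0.0, -1.0, 0.0],
     [0.0, 0.0, 0.0, -1.0]]

def lower(k_up):
    """Lower the index of a four-vector: k_mu = g_{mu nu} k^nu."""
    return [G[m][m] * k_up[m] for m in range(4)]

def k_squared(k_up):
    """Invariant k^2 = k^mu k_mu."""
    return sum(a * b for a, b in zip(k_up, lower(k_up)))

def wave_operator(k_up, mass):
    """Square bracket of (19.4): (-k^2 + M^2) g^{nu mu} + k^nu k^mu."""
    k2 = k_squared(k_up)
    return [[(-k2 + mass**2) * G[n][m] + k_up[n] * k_up[m]
             for m in range(4)] for n in range(4)]

def candidate_inverse(k_up, mass):
    """Expression (19.5) with both indices lowered."""
    k2 = k_squared(k_up)
    k_low = lower(k_up)
    return [[(-G[m][r] + k_low[m] * k_low[r] / mass**2) / (k2 - mass**2)
             for r in range(4)] for m in range(4)]

def matmul(a, b):
    """Plain 4x4 matrix product, contracting the inner index."""
    return [[sum(a[i][j] * b[j][l] for j in range(4)) for l in range(4)]
            for i in range(4)]

k = [3.0, 0.5, -1.0, 2.0]   # arbitrary off-shell momentum (illustrative)
M = 1.5                     # arbitrary mass (illustrative)

# Massive case: the product should be the 4x4 unit matrix.
product = matmul(wave_operator(k, M), candidate_inverse(k, M))

# Massless case: the operator annihilates k_mu, so it has no inverse.
massless_operator = wave_operator(k, 0.0)
massless_on_k = [sum(massless_operator[n][m] * lower(k)[m] for m in range(4))
                 for n in range(4)]
```

In the massless case the operator has the zero mode $k_\mu$, which is the algebraic reason why the photon propagator cannot be obtained in this way without a gauge choice.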

Now consider the loop integral in figure 19.1. At each vertex we will have
a coupling constant g, associated with an interaction Lagrangian having the
general form $g\bar{\hat\psi}\gamma_\mu\hat\psi\hat{X}^\mu$ (a $\gamma_\mu\gamma_5$ coupling could also be present but will not
affect the argument). Just as in QED, this ‘g’ is dimensionless but, as we
warned the reader in section 11.8, this may not guarantee renormalizability,
and indeed this is a case where it does not. To get an idea of why this might
be so, consider the leading divergent behaviour of figure 19.1. This will be
associated with the $k^\mu k^\nu$ terms in the numerator of (19.5), so that the leading
divergence is effectively

$$\sim \int d^4k \left(\frac{k^\mu k^\nu}{k^2}\right)\left(\frac{k^\rho k^\sigma}{k^2}\right)\frac{1}{\not{k}}\,\frac{1}{\not{k}} \tag{19.6}$$

for high k-values (we are not troubling to get all the indices right, we are
omitting the spinors altogether, and we are looking only at the large k part
of the propagators). Now the first two bracketed terms in (19.6) behave like
a constant at large k, so that the divergence is effectively

$$\sim \int \frac{d^4k}{\not{k}\,\not{k}} \tag{19.7}$$

which is quadratically divergent, and indeed exactly what we would get in a
‘four-fermion’ theory; see (11.98) for example. This strongly suggests that
the theory is non-renormalizable.

Where have these dangerous powers of k in the numerator of (19.6) come 
from? The answer is simple and important. They come from the longitudinal 
polarization state of the massive X-particle, as we shall now explain. The
free-particle wave equation is

$$(\Box + M^2)X^\nu - \partial^\nu(\partial_\mu X^\mu) = 0 \tag{19.8}$$

and plane wave solutions have the form

$$X^\nu = \epsilon^\nu e^{-ik\cdot x}. \tag{19.9}$$

Hence the polarization vectors $\epsilon^\nu$ satisfy the condition

$$(-k^2 + M^2)\epsilon^\nu + k^\nu k_\mu\epsilon^\mu = 0. \tag{19.10}$$

Taking the ‘dot’ product of (19.10) with $k_\nu$ leads to

$$M^2\, k\cdot\epsilon = 0, \tag{19.11}$$

which implies (for $M^2 \neq 0$!)

$$k\cdot\epsilon = 0. \tag{19.12}$$
Equation (19.12) is a covariant condition, which has the effect of ensuring
that there are just three independent polarization vectors, as we expect for a
spin-1 particle. Let us take $k^\mu = (k^0, 0, 0, |\boldsymbol{k}|)$; then the x- and y-directions
are ‘transverse’ while the z-direction is ‘longitudinal’. Now, in the rest frame
of the X, such that $k_{\rm rest} = (M, 0, 0, 0)$, (19.12) reduces to $\epsilon^0 = 0$, and we may
choose three independent $\epsilon$'s as

$$\epsilon^\mu(k_{\rm rest}, \lambda) = (0, \boldsymbol{\epsilon}(\lambda)) \tag{19.13}$$

with

$$\boldsymbol{\epsilon}(\lambda = \pm 1) = \mp 2^{-1/2}(1, \pm i, 0) \tag{19.14}$$
$$\boldsymbol{\epsilon}(\lambda = 0) = (0, 0, 1). \tag{19.15}$$

The $\boldsymbol{\epsilon}$’s are ‘orthonormalized’ so that (cf (7.86))

$$\boldsymbol{\epsilon}(\lambda)^* \cdot \boldsymbol{\epsilon}(\lambda') = \delta_{\lambda\lambda'}. \tag{19.16}$$

These states have definite spin projection ($\lambda = \pm 1, 0$) along the z-axis. For
the result in a general frame, we can Lorentz transform $\epsilon^\mu(k_{\rm rest}, \lambda)$ as required.
For example, in a frame such that $k^\mu = (k^0, 0, 0, |\boldsymbol{k}|)$, we find

$$\epsilon^\mu(k, \lambda = \pm 1) = \epsilon^\mu(k_{\rm rest}, \lambda = \pm 1) \tag{19.17}$$

as before, but the longitudinal polarization vector becomes (problem 19.2)

$$\epsilon^\mu(k, \lambda = 0) = M^{-1}(|\boldsymbol{k}|, 0, 0, k^0). \tag{19.18}$$

Note that $k \cdot \epsilon(k, \lambda = 0) = 0$ as required.
From (19.17) and (19.18) it is straightforward to verify the result (problem
19.3)

$$\sum_{\lambda = 0, \pm 1} \epsilon^\mu(k, \lambda)\,\epsilon^{\nu *}(k, \lambda) = -g^{\mu\nu} + k^\mu k^\nu/M^2. \tag{19.19}$$
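
The spin sum (19.19) can be checked numerically from the explicit vectors (19.14), (19.15), (19.17) and (19.18). The following is a minimal sketch, assuming the metric diag(+1, -1, -1, -1) and momentum along the z-axis; the numerical values and the function names are purely illustrative.

```python
import math

G_DIAG = [1.0, -1.0, -1.0, -1.0]   # metric diag(+,-,-,-)

def momentum(kz, mass):
    """On-shell four-momentum k^mu = (k^0, 0, 0, |k|) for a particle of the given mass."""
    return [math.sqrt(kz**2 + mass**2), 0.0, 0.0, kz]

def polarization_vectors(kz, mass):
    """Polarization four-vectors of (19.17) and (19.18), keyed by helicity."""
    k0 = math.sqrt(kz**2 + mass**2)
    s = 1.0 / math.sqrt(2.0)
    return {
        +1: [0.0, -s, -1j * s, 0.0],          # -(1, +i, 0)/sqrt(2)
        -1: [0.0, s, -1j * s, 0.0],           # +(1, -i, 0)/sqrt(2)
        0: [kz / mass, 0.0, 0.0, k0 / mass],  # longitudinal state (19.18)
    }

def spin_sum(eps):
    """Left-hand side of (19.19): sum over helicities of eps^mu eps^{nu *}."""
    return [[sum(e[m] * complex(e[n]).conjugate() for e in eps.values())
             for n in range(4)] for m in range(4)]

def spin_sum_formula(k_up, mass):
    """Right-hand side of (19.19): -g^{mu nu} + k^mu k^nu / M^2."""
    return [[-(G_DIAG[m] if m == n else 0.0) + k_up[m] * k_up[n] / mass**2
             for n in range(4)] for m in range(4)]

def dot(k_up, e_up):
    """Minkowski product k . eps."""
    return sum(G_DIAG[m] * k_up[m] * e_up[m] for m in range(4))

mass, kz = 1.0, 2.5                       # illustrative values
k = momentum(kz, mass)
eps = polarization_vectors(kz, mass)
lhs, rhs = spin_sum(eps), spin_sum_formula(k, mass)

# At large |k| the longitudinal vector approaches k^mu / M component by component.
big_k = momentum(1000.0, mass)
big_eps0 = polarization_vectors(1000.0, mass)[0]
ratios = [big_eps0[m] / (big_k[m] / mass) for m in (0, 3)]
```

The two entries of `ratios` approach 1 as $|\boldsymbol{k}|/M$ grows, which is the large-k behaviour of the longitudinal state responsible for the $k^\mu k^\nu/M^2$ term.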


Consider now the propagator for a spin-1/2 particle, given in (7.63):

$$\frac{i(\not{k} + m)}{k^2 - m^2 + i\epsilon}. \tag{19.20}$$

Equation (7.64) shows that the factor in the numerator of (19.20) arises from
the spin sum

$$\sum_s u_\alpha(k, s)\,\bar{u}_\beta(k, s) = (\not{k} + m)_{\alpha\beta}. \tag{19.21}$$

In just the same way, the massive spin-1 propagator is given by

$$\frac{i[-g^{\mu\nu} + k^\mu k^\nu/M^2]}{k^2 - M^2 + i\epsilon}, \tag{19.22}$$

the numerator in (19.22) arising from the spin sum (19.19). Thus the dangerous
factor $k^\mu k^\nu/M^2$ can be traced to the spin sum (19.19): in particular, at
large values of k the longitudinal state $\epsilon^\mu(k, \lambda = 0)$ is proportional to $k^\mu$, and
this is the origin of the numerator factors $k^\mu k^\nu/M^2$ in (19.22).

We shall not give further details here (see also section 22.1.2), but merely 
state that theories with massive charged vector bosons are indeed non-renor- 
malizable. Does this matter? In section 11.8 we explained why it is thought 
that the relevant theories at presently accessible energy scales should be renormalizable
theories. And, apart from anything else, they are much more predictive.
Is there, then, any way of getting rid of the offending ‘$k^\mu k^\nu$’ terms
in the X-propagator, so as (perhaps) to render the theory renormalizable?
Consider the photon propagator of chapter 7 repeated here:

$$\frac{i[-g^{\mu\nu} + (1 - \xi)k^\mu k^\nu/k^2]}{k^2 + i\epsilon}. \tag{19.23}$$

This contains somewhat similar factors of $k^\mu k^\nu$ (admittedly divided by $k^2$
rather than $M^2$), but they are gauge-dependent, and can in fact be ‘gauged
away’ entirely, by choice of the gauge parameter $\xi$ (namely by taking $\xi = 1$).
But, as we have seen, such ‘gauging’ (essentially the freedom to make gauge
transformations) seems to be possible only in a massless vector theory.

A closely related point is that, as section 7.3.1 showed, free photons exist 
in only two polarization states (electromagnetic waves are purely transverse), 
instead of the three we might have expected for a vector (spin-1) particle — 
and as do indeed exist for massive vector particles. This gives another way 
of seeing in what way a massless vector particle is really very different from 
a massive one: the former has only two (spin) degrees of freedom, while the 
latter has three, and it is not at all clear how to ‘lose’ the offending longitudinal 
state smoothly (certainly not, as we have seen, by letting $M \to 0$ in (19.5)).

These considerations therefore suggest the following line of thought: is it 
possible somehow to create a theory involving massive vector bosons, in such 
a way that the dangerous $k^\mu k^\nu$ term can be ‘gauged away’, making the theory
renormalizable? The answer is yes, via the idea of spontaneous breaking of 
gauge symmetry. This is the natural generalization of the spontaneous global 
symmetry breaking considered in chapter 17. By way of advance notice, the 
crucial formula is (19.74) for the propagator in such a theory, which is to be 
compared with (19.22). 

The first serious challenge to the then widely held view that electro- 
magnetic gauge invariance requires the photon to be massless was made by 
Schwinger (1962), as we pointed out in section 11.4. Soon afterwards, Ander- 
son (1963) argued that several situations in solid state physics could be inter- 
preted in terms of an effectively massive electromagnetic field. He outlined a 
general framework for treating the phenomenon of the acquisition of mass by 
a gauge boson, and discussed its possible relevance to contemporary attempts 
(Sakurai 1960) to interpret the recently discovered vector mesons ($\rho, \omega, \phi, \ldots$)
as the gauge quanta associated with a local extension of hadronic flavour symmetry.
From his discussion, it is clear that Anderson had his doubts about
the hadronic application, precisely because, as he remarked, gauge bosons can 
only acquire a mass if the symmetry is spontaneously broken. This has the 
consequence, as we saw in chapter 17, that the multiplet structure ordinarily 
associated with a non-Abelian symmetry would be lost. But we know that 
flavour symmetry, even if admittedly not exact, certainly leads to identifiable 
multiplets, which are at least approximately degenerate in mass. It was Wein- 
berg (1967) and Salam (1968) who made the correct application of these ideas, 
to the generation of mass for the gauge quanta associated with the weak force. 
There is, however, nothing specifically relativistic about the basic mechanism 
involved, nor need we start with the non-Abelian case. In fact, the physics 
is well illustrated by the non-relativistic Abelian (i.e. electromagnetic) case 
— which is nothing but the physics of superconductivity. Our presentation is 
influenced by that of Anderson (1963). 




19.2 The generation of ‘photon mass’ in a superconductor:
Ginzburg-Landau theory and the Meissner effect


In chapter 17, section 17.7, we gave a brief introduction to some aspects 
of the BCS theory of superconductivity. We were concerned mainly with 
the nature of the BCS ground state, and with the non-perturbative origin 
of the energy gap for elementary excitations. In particular, as noted after 
(17.128), we omitted completely all electromagnetic couplings of the electrons 
in the ‘microscopic’ Hamiltonian. It is certainly possible to complete the BCS 
theory in this way, so as to include within the same formalism a treatment of 
electromagnetic effects (e.g. the Meissner effect) in a superconductor. We refer 
interested readers to the book by Schrieffer (1964), chapter 8. Instead, we shall 
follow a less ‘microscopic’ and somewhat more ‘phenomenological’ approach, 
which has a long history in theoretical studies of superconductivity, and is in 
some ways actually closer (at least formally) to our eventual application in 
particle physics. 


In section 17.3.1 we introduced the concept of an ‘order parameter’, a 
quantity which was a measure of the ‘degree of ordering’ of a system below 
some transition temperature. In the case of superconductivity, the order parameter
(in this sense) is taken to be a complex scalar field $\psi$, as originally
proposed by Ginzburg and Landau (1950), well before the appearance of BCS 
theory. Subsequently, Gorkov (1959) and others showed how the Ginzburg- 
Landau description could be derived from BCS theory, in certain domains of 
temperature and magnetic field. This work all relates to static phenomena. 
More recently, an analogous ‘effective theory’ for time-dependent phenomena 
(at zero temperature) has been derived from a BCS-type model (Aitchison et
al. 1995). For the moment, we shall follow a more qualitative approach. 

The Ginzburg-Landau field $\psi$ is commonly referred to as the ‘macroscopic
wave function’. This terminology originates from the recognition that in the
BCS ground state a macroscopic number of Cooper pairs have ‘condensed’
into the state of lowest energy, a situation similar to that in the Bogoliubov
superfluid. Further, this state is highly coherent, all pairs having the same
total momentum (namely zero, in the case of (17.140)). These considerations
suggest that a successful phenomenology can be built by invoking the idea of
a macroscopic wavefunction $\psi$, describing the condensate. Note that $\psi$ is a
‘bosonic’ quantity, referring essentially to paired electrons. Perhaps the single
most important property of $\psi$ is that it is assumed to be normalized to the
total density of Cooper pairs $n_c$ via the relation

$$|\psi|^2 = n_c = n_s/2 \tag{19.24}$$

where $n_s$ is the density of superconducting electrons. The quantities $n_c$ and
$n_s$ will depend on temperature T, tending to zero as T approaches the superconducting
transition temperature $T_c$ from below. The precise connection
between $\psi$ and the microscopic theory is indirect; in particular, $\psi$ has no
knowledge of the coordinates of individual electron pairs. Nevertheless, as an
‘empirical’ order parameter, it may be thought of as in some way related to
the ground state ‘pair’ expectation value introduced in (17.121): in particular,
the charge associated with $\psi$ is taken to be $-2e$, and the mass is $2m_e$.

The Ginzburg-Landau description proceeds by considering the quantum-mechanical
electromagnetic current associated with $\psi$, in the presence of a
static external electromagnetic field described by a vector potential $\boldsymbol{A}$. This
current was considered in section 2.4, and is given by the gauge-invariant form
of (A.7), namely

$$\boldsymbol{j}_{\rm em} = \frac{-2e}{4m_e i}\left[\psi^*(\nabla + 2ie\boldsymbol{A})\psi - \{(\nabla + 2ie\boldsymbol{A})\psi\}^*\psi\right]. \tag{19.25}$$

Note that we have supplied an overall factor of $-2e$ to turn the Schrödinger
‘number density’ current into the appropriate electromagnetic current. Assuming
now that, consistently with (19.24), $\psi$ is varying primarily through
its phase degree of freedom $\phi$, rather than its modulus $|\psi|$, we can rewrite
(19.25) as

$$\boldsymbol{j}_{\rm em} = -\frac{2e^2}{m_e}\left(\boldsymbol{A} + \frac{1}{2e}\nabla\phi\right)|\psi|^2 \tag{19.26}$$

where $\psi = e^{i\phi}|\psi|$. We easily verify that (19.26) is invariant under the gauge
transformation (2.41), which can be written in this case as

$$\boldsymbol{A} \to \boldsymbol{A} + \nabla\chi \tag{19.27}$$
$$\phi \to \phi - 2e\chi. \tag{19.28}$$
We now replace $|\psi|^2$ in (19.26) by $n_s/2$ in accordance with (19.24), and take
the curl of the resulting equation to obtain

$$\nabla \times \boldsymbol{j}_{\rm em} = -\left(\frac{e^2 n_s}{m_e}\right)\boldsymbol{B}. \tag{19.29}$$

Equation (19.29) is known as the London equation (London 1950), and is one
of the fundamental phenomenological relations in superconductivity.
The significance of (19.29) emerges when we combine it with the (static)
Maxwell equation

$$\nabla \times \boldsymbol{B} = \boldsymbol{j}_{\rm em}. \tag{19.30}$$

Taking the curl of (19.30), and using $\nabla \times (\nabla \times \boldsymbol{B}) = \nabla(\nabla \cdot \boldsymbol{B}) - \nabla^2\boldsymbol{B}$ and
$\nabla \cdot \boldsymbol{B} = 0$, we find

$$\nabla^2\boldsymbol{B} = \left(\frac{e^2 n_s}{m_e}\right)\boldsymbol{B}. \tag{19.31}$$
The variation of magnetic field described by (19.31) is a very characteristic
one encountered in a number of contexts in condensed matter physics. First,
we note that the quantity $(e^2 n_s/m_e)$ must, in our units, have the dimensions
of $({\rm length})^{-2}$, by comparison with the left-hand side of (19.31). Let us write

$$\left(\frac{e^2 n_s}{m_e}\right) = \frac{1}{\lambda^2}. \tag{19.32}$$

Next, consider for simplicity one-dimensional variation

$$\frac{d^2 B}{dx^2} = \frac{1}{\lambda^2}B \tag{19.33}$$

in the half-plane $x > 0$, say. Then the solutions of (19.33) have the form

$$B(x) = B_0 \exp(-x/\lambda); \tag{19.34}$$

the exponentially growing solution is rejected as unphysical. The field therefore
penetrates only a distance of order $\lambda$ into the region $x > 0$. The range
parameter $\lambda$ is called the screening length. This expresses the fact that, in a
medium such that (19.29) holds, the magnetic field will be ‘screened out’ from
penetrating further into the medium.

The physical origin of the screening is provided by Lenz's law: when a 
magnetic field is applied to a system of charged particles, induced EMFs are 
set up which accelerate the particles, and the magnetic effect of the resulting 
currents tends to cancel (or screen) the applied field. On the atomic scale this 
is the cause of atomic diamagnetism. Here the effect is occurring on a macroscopic
scale (as mediated by the ‘macroscopic wavefunction’ $\psi$), and leads to
the Meissner effect: the exclusion of flux from the interior of a superconductor.
In this case, screening currents are set up within the superconductor,
over distances of order $\lambda$ from the exterior boundary of the material. These
exactly cancel (perfectly screen) the applied flux density in the interior.
With $n_s \sim 4 \times 10^{28}\ {\rm m}^{-3}$ (roughly one conduction electron per atom) we find

$$\lambda = \left(\frac{m_e}{n_s e^2}\right)^{1/2} \approx 10^{-8}\ {\rm m}, \tag{19.35}$$

which is the correct order of magnitude for the thickness of the surface layer
within which screening currents flow, and over which the applied field falls to
zero. As $T \to T_c$, $n_s \to 0$ and $\lambda$ becomes arbitrarily large, so that flux is no
longer screened.
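
The order-of-magnitude estimate (19.35) can be reproduced with a few lines of arithmetic. The sketch below is an illustration only: it assumes the SI form of the screening length, $\lambda = (m_e/\mu_0 n_s e^2)^{1/2}$ (the factor $\mu_0$ is absent in the units used in the text), rounded values of the physical constants, and a function name of our own choosing.

```python
import math

# Rounded SI constants; the digits are illustrative, not authoritative.
ELECTRON_MASS = 9.1093837e-31        # kg
ELEMENTARY_CHARGE = 1.602176634e-19  # C
MU_0 = 1.25663706e-6                 # N A^-2 (vacuum permeability)

def london_screening_length(n_s):
    """Screening length in metres for a superconducting electron density n_s (m^-3).

    Assumes the SI form lambda = sqrt(m_e / (mu_0 n_s e^2)).
    """
    return math.sqrt(ELECTRON_MASS / (MU_0 * n_s * ELEMENTARY_CHARGE**2))

screening_length = london_screening_length(4e28)
print(f"screening length = {screening_length:.2e} m")
```

The result is a few times $10^{-8}$ m, consistent with the order of magnitude quoted in (19.35). Since $\lambda \propto n_s^{-1/2}$, the length grows without bound as $n_s \to 0$.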

It is quite simple to interpret equation (19.31) in terms of an ‘effective
non-zero photon mass’. Consider the equation (19.8) for a free massive vector
field. Taking the divergence via $\partial_\nu$ leads to

$$M^2\partial_\nu X^\nu = 0 \tag{19.36}$$

(cf (19.11)), and so (19.8) can be written as

$$(\Box + M^2)X^\nu = 0, \tag{19.37}$$

which simply expresses the fact that each component of $X^\nu$ has mass M. Now
consider the static version of (19.37), in the rest frame of the X-particle in
which (see equation (19.13)) the $\nu = 0$ component vanishes. Equation (19.37)
reduces to

$$\nabla^2\boldsymbol{X} = M^2\boldsymbol{X} \tag{19.38}$$

which is exactly the same in form as (19.31) (if $\boldsymbol{X}$ were the electromagnetic
field $\boldsymbol{A}$, we could take the curl of (19.38) to obtain (19.31) via $\boldsymbol{B} = \nabla \times \boldsymbol{A}$).
The connection is made precise by making the association

$$M^2 = \left(\frac{e^2 n_s}{m_e}\right) = \frac{1}{\lambda^2}. \tag{19.39}$$

Equation (19.39) shows very directly another way of understanding the ‘screening
length $\leftrightarrow$ photon mass’ connection: in our units $\hbar = c = 1$, a mass has
the dimension of an inverse length, and so we naturally expect to be able to
interpret $\lambda^{-1}$ as an equivalent mass (for the photon, in this case).
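
To put a number on the ‘screening length $\leftrightarrow$ photon mass’ correspondence, one can restore $\hbar$ and $c$ and evaluate $Mc^2 = \hbar c/\lambda$. The following minimal sketch assumes the rounded value $\hbar c \approx 197.327$ eV nm and takes $\lambda = 10$ nm, the order of magnitude quoted in (19.35); the function name is ours.

```python
HBAR_C_EV_NM = 197.3269804   # hbar * c in eV nm (rounded)

def equivalent_mass_ev(screening_length_nm):
    """Energy M c^2 in eV whose reduced Compton wavelength equals the screening length."""
    return HBAR_C_EV_NM / screening_length_nm

effective_photon_mass = equivalent_mass_ev(10.0)
print(f"effective photon mass ~ {effective_photon_mass:.1f} eV")
```

A screening length of order $10^{-8}$ m therefore corresponds to an effective photon mass of roughly 20 eV inside the superconductor.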

The above treatment conveys much of the essential physics behind the
phenomenon of ‘photon mass generation’ in a superconductor. In particular,
it suggests rather strongly that a second field, in addition to the electromagnetic
one, is an essential element in the story (here, it is the $\psi$ field). This
provides a partial answer to the puzzle about the discontinuous change in
the number of spin degrees of freedom in going from a massless to a massive
gauge field: actually, some other field has to be supplied. Nevertheless, many
questions remain unanswered so far. For example, how is all the foregoing
related to what we learned in chapter 17 about spontaneous symmetry breaking?
Where is the Goldstone mode? Is it really all gauge invariant? And
what about Lorentz invariance? Can we provide a Lagrangian description of
the phenomenon? The answers to these questions are mostly contained in the
model to which we now turn, which is due to Higgs (1964) and is essentially
the local version of the U(1) Goldstone model of section 17.5.




19.3 Spontaneously broken local U(1) symmetry: 
the Abelian Higgs model 


This model is just $\hat{\mathcal{L}}_G$ of (17.69) and (17.77), extended so as to be locally,
rather than merely globally, U(1) invariant. Due originally to Higgs (1964), it
provides a deservedly famous and beautifully simple model for investigating
what happens when a gauge symmetry is spontaneously broken.

To make (17.69) locally U(1) invariant, we need only replace the $\partial$’s by
$\hat{D}$’s according to the rule (7.123), and add the Maxwell piece. This produces

$$\hat{\mathcal{L}}_H = [(\partial^\mu + iq\hat{A}^\mu)\hat\phi]^\dagger[(\partial_\mu + iq\hat{A}_\mu)\hat\phi] - \frac{1}{4}\hat{F}_{\mu\nu}\hat{F}^{\mu\nu} - \frac{1}{4}\lambda(\hat\phi^\dagger\hat\phi)^2 + \mu^2(\hat\phi^\dagger\hat\phi). \tag{19.40}$$

(19.40) is invariant under the local version of (17.72), namely

$$\hat\phi(x) \to \hat\phi'(x) = e^{-i\hat\alpha(x)}\hat\phi(x) \tag{19.41}$$

when accompanied by the gauge transformation on the potentials

$$\hat{A}^\mu(x) \to \hat{A}'^\mu(x) = \hat{A}^\mu(x) + \frac{1}{q}\partial^\mu\hat\alpha(x). \tag{19.42}$$

Before proceeding any further, we note at once that this model contains four
field degrees of freedom: two in the complex scalar Higgs field $\hat\phi$, and two in
the massless gauge field $\hat{A}^\mu$.
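
The invariance of (19.40) under the combined transformations (19.41) and (19.42) rests on the fact that the covariant derivative $(\partial_\mu + iq\hat{A}_\mu)\hat\phi$ picks up the same phase factor as $\hat\phi$ itself. The sketch below checks this numerically in a one-dimensional classical toy version, under stated assumptions: the fields are ordinary functions of a single coordinate, derivatives are taken by central differences, and the particular functions, the coupling value and all names are invented for illustration.

```python
import cmath
import math

Q = 0.5   # illustrative coupling

def phi(x):
    """Toy complex scalar field."""
    return (1.0 + 0.3 * x) * cmath.exp(0.7j * x)

def a_field(x):
    """Toy gauge potential (one component only)."""
    return math.sin(x)

def alpha(x):
    """Toy gauge function alpha(x)."""
    return x**2

def d_alpha(x):
    """Exact derivative of alpha(x)."""
    return 2.0 * x

def phi_prime(x):
    """Transformed scalar field, as in (19.41)."""
    return cmath.exp(-1j * alpha(x)) * phi(x)

def a_prime(x):
    """Transformed potential, as in (19.42)."""
    return a_field(x) + d_alpha(x) / Q

def derivative(f, x, step=1e-5):
    """Central-difference derivative."""
    return (f(x + step) - f(x - step)) / (2.0 * step)

def covariant_derivative(field, potential, x):
    """(d/dx + i q A) acting on the field."""
    return derivative(field, x) + 1j * Q * potential(x) * field(x)

x0 = 0.8
before = covariant_derivative(phi, a_field, x0)
after = covariant_derivative(phi_prime, a_prime, x0)
phase = cmath.exp(-1j * alpha(x0))
```

Up to finite-difference error, `after` equals `phase * before`, so the modulus squared of the covariant derivative (the kinetic term of the scalar field) is unchanged.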

We learned in section 17.5 that the form of the potential terms in (19.40)
(specifically the $\mu^2$ one) does not lend itself to a natural particle interpretation,
which only appears after making a ‘shift to the classical minimum’, as
in (17.84). But there is a remarkable difference between the global and local
cases. In the present (local) case, the phase of $\hat\phi$ is completely arbitrary, since
any change in $\hat\alpha$ of (19.41) can be compensated by an appropriate transformation
(19.42) on $\hat{A}^\mu$, leaving $\hat{\mathcal{L}}_H$ the same as before. Thus the field $\hat\theta$ in
(17.84) can be ‘gauged away’ altogether, if we choose! But $\hat\theta$ was precisely the
Goldstone field, in the global case. This must mean that there is somehow
no longer any physical manifestation of the massless mode. This is the first
unexpected result in the local case. We may also be reminded of our desire to
‘gauge away’ the longitudinal polarization states for a ‘massive gauge’ boson:
we shall return to this later.
However, a degree of freedom (the Goldstone mode) cannot simply disappear.
Somehow the system must keep track of the fact that we started
with four degrees of freedom. To see what is going on, let us study the field
equation for $\hat{A}^\nu$, namely

$$\Box\hat{A}^\nu - \partial^\nu(\partial_\mu\hat{A}^\mu) = \hat{j}^\nu_{\rm em} \tag{19.43}$$

where $\hat{j}^\nu_{\rm em}$ is the electromagnetic current contained in (19.40). This current
can be obtained just as in (7.141), and is given by

$$\hat{j}^\nu_{\rm em} = iq(\hat\phi^\dagger\partial^\nu\hat\phi - (\partial^\nu\hat\phi)^\dagger\hat\phi) - 2q^2\hat{A}^\nu\hat\phi^\dagger\hat\phi. \tag{19.44}$$

We now insert the field parametrization (cf (17.84))

$$\hat\phi(x) = \frac{1}{\sqrt{2}}(v + \hat{h}(x))\exp(-i\hat\theta(x)/v) \tag{19.45}$$

into (19.44) where $v/\sqrt{2} = 2^{1/2}|\mu|/\lambda^{1/2}$ is the position of the minimum of the
classical potential as a function of $|\phi|$, as in (17.81). We obtain (problem 19.4)

$$\hat{j}^\nu_{\rm em} = -v^2q^2\left(\hat{A}^\nu - \frac{\partial^\nu\hat\theta}{vq}\right) + \text{terms quadratic and cubic in the fields}. \tag{19.46}$$


The linear part of the right-hand side of (19.46) is directly analogous to the
non-relativistic current (19.26), interpreting $\hat\theta$ as essentially playing the role of
$\phi$, and $|\psi|^2$ the role of $v^2$. Retaining just the linear terms in (19.46) (the others
would appear on the right-hand side of equation (19.47) following, where they
would represent interactions), and placing this $\hat{j}^\nu_{\rm em}$ in (19.43), we obtain

$$\Box\hat{A}^\nu - \partial^\nu\partial_\mu\hat{A}^\mu = -v^2q^2\left(\hat{A}^\nu - \frac{\partial^\nu\hat\theta}{vq}\right). \tag{19.47}$$

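Now a gauge transformation on $\hat{A}^\nu$ has the form shown in (19.42), for arbitrary
$\hat\alpha$. So we can certainly regard the whole expression $(\hat{A}^\nu - \partial^\nu\hat\theta/vq)$ as a perfectly
acceptable gauge field. Let us define

$$\hat{A}'^\nu = \hat{A}^\nu - \frac{\partial^\nu\hat\theta}{vq}. \tag{19.48}$$

Then, since we know that the left-hand side of (19.47) is invariant under
(19.42), the resulting equation for $\hat{A}'^\nu$ is

$$\Box\hat{A}'^\nu - \partial^\nu\partial_\mu\hat{A}'^\mu = -v^2q^2\hat{A}'^\nu, \tag{19.49}$$

or

$$(\Box + v^2q^2)\hat{A}'^\nu - \partial^\nu\partial_\mu\hat{A}'^\mu = 0. \tag{19.50}$$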



But (19.50) is nothing but the equation (19.8) for a free massive vector field, 
with mass M = vq! This fundamental observation was first made, in the 
relativistic context, by Englert and Brout (1964), Higgs (1964), and Guralnik 
et al. (1964); for a full account, see Higgs (1966). 

The foregoing analysis shows us two things. First, the current (19.46) is
indeed a relativistic analogue of (19.26), in that it provides a ‘screening’ (mass
generation) effect on the gauge field. Second, equation (19.48) shows how the
phase degree of freedom of the Higgs field $\hat\phi$ has been incorporated into a new
gauge field $\hat{A}'^\nu$, which is massive, and therefore has ‘three’ spin degrees of
freedom. In fact, we can go further. If we imagine plane wave solutions for
$\hat{A}'^\nu$, $\hat{A}^\nu$ and $\hat\theta$, we see that the $\partial^\nu\hat\theta/vq$ part of (19.48) will contribute something
proportional to $k^\nu/M$ to the polarization vector of $\hat{A}'^\nu$ (recall $M = vq$). But
this is exactly the (large k) behaviour of the longitudinal polarization vector of
a massive vector particle. We may therefore say that the massless gauge field
$\hat{A}^\nu$ has ‘swallowed’ the Goldstone field $\hat\theta$ via (19.48) to make the massive vector
field $\hat{A}'^\nu$. The Goldstone field disappears as a massless degree of freedom, and
reappears, via its gradient, as the longitudinal part of the massive vector field.
In this way the four degrees of freedom are all now safely accounted for: three
are in the massive vector field $\hat{A}'^\nu$, and one is in the real scalar field $\hat{h}$ (to
which we shall return shortly).

In this (relativistic) case, we know from Lorentz covariance that all the 
components (transverse and longitudinal) of the vector field must have the 
same mass, and this has of course emerged automatically from our covariant 
treatment. But the transverse and longitudinal degrees of freedom respond 
differently in the non-relativistic (superconductor) case. There, the longitu- 
dinal part of A couples strongly to longitudinal excitations of the electrons: 
primarily, as Bardeen (1957) first recognized, to the collective density fluctu- 
ation mode of the electron system — that is, to plasma oscillations. This is 
a high frequency mode, and is essentially the one discussed in section 17.3.2, 
after equation (17.46). When this aspect of the dynamics of the electrons is 
included, a fully gauge invariant description of the electromagnetic properties
of superconductors, within the BCS theory, is obtained (Schrieffer 1964,
chapter 8).

We return to equations (19.48)-(19.50). Taking the divergence of (19.50)
leads, as we have seen, to the condition

$$\partial_\mu\hat{A}'^\mu = 0 \tag{19.51}$$

on $\hat{A}'^\mu$. It follows that in order to interpret the relation (19.48) as a gauge
transformation on $\hat{A}^\nu$ we must, to be consistent with (19.51), regard $\hat{A}^\mu$ as
being in a gauge specified by

$$\partial_\mu\hat{A}^\mu = \frac{1}{vq}\Box\hat\theta = \frac{1}{M}\Box\hat\theta. \tag{19.52}$$

In going from the situation described by $\hat{A}^\mu$ and $\hat\theta$ to one described by $\hat{A}'^\mu$
alone via (19.48), we have evidently chosen a gauge function (cf (19.42))

$$\hat\alpha(x) = -\hat\theta(x)/v. \tag{19.53}$$

Recalling then the form of the associated local phase change on $\hat\phi(x)$

$$\hat\phi(x) \to \hat\phi'(x) = e^{i\hat\theta(x)/v}\hat\phi(x) \tag{19.54}$$

we see that the phase of $\hat\phi$ in (19.45) has been reduced to zero, in this choice
of gauge. Thus it is indeed possible to ‘gauge $\hat\theta$ away’ in (19.45), but then
the vector field we must use is $\hat{A}'^\nu$, satisfying the massive equation (19.50)
(ignoring other interactions). In superconductivity, the choice of gauge which
takes the macroscopic wavefunction to be real (i.e. $\phi = 0$ in (19.26)) is called
the ‘London gauge’. In the next section we shall discuss a subtlety in the
argument which applies in the case of real superconductors, and which leads
to the phenomenon of flux quantization.

The fact that this ‘Higgs mechanism’ leads to a massive vector field can
be seen very economically by working in the particular gauge for which $\hat\phi$ is
real, and inserting the parametrization (cf (19.45))

$$\hat\phi = \frac{1}{\sqrt{2}}(v + \hat{h}) \tag{19.55}$$

into the Lagrangian $\hat{\mathcal{L}}_H$. Retaining only the terms quadratic in the fields one
finds (problem 19.5)

$$\begin{aligned}\hat{\mathcal{L}}_H^{\rm quad} = {} & -\frac{1}{4}(\partial_\mu\hat{A}_\nu - \partial_\nu\hat{A}_\mu)(\partial^\mu\hat{A}^\nu - \partial^\nu\hat{A}^\mu) + \frac{1}{2}q^2v^2\hat{A}_\mu\hat{A}^\mu \\ & + \frac{1}{2}\partial_\mu\hat{h}\,\partial^\mu\hat{h} - \mu^2\hat{h}^2.\end{aligned} \tag{19.56}$$

The first line of (19.56) is exactly the Lagrangian for a spin-1 field of mass
$vq$, i.e. the Maxwell part with the addition of a mass term (note that the
sign of the mass term is correct for the spatial (physical) degrees of freedom);
the second line is the Lagrangian of a scalar particle of mass $\sqrt{2}\mu$. The latter
is the mass of excitations of the Higgs field $\hat{h}$ away from its vacuum value
(compare the global U(1) case discussed in section 17.5). The necessity for
the existence of one or more massive scalar particles (‘Higgs bosons’), when
a gauge symmetry is spontaneously broken in this way, was pointed out by
Higgs (1964).
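
The two masses read off from (19.56) can be cross-checked numerically from the classical potential in (19.40). The sketch below is a hedged illustration: it treats the fields as classical numbers, evaluates the potential $V = \frac{1}{4}\lambda(\phi^\dagger\phi)^2 - \mu^2\phi^\dagger\phi$ for real $\phi = (v + h)/\sqrt{2}$, and uses finite differences; the parameter values and function names are arbitrary choices of ours.

```python
import math

def vacuum_value(lam, mu):
    """v = 2 mu / sqrt(lam), the minimum quoted after (19.45)."""
    return 2.0 * mu / math.sqrt(lam)

def potential(h, lam, mu):
    """V = (lam/4)(phi^dag phi)^2 - mu^2 phi^dag phi with real phi = (v + h)/sqrt(2)."""
    rho = 0.5 * (vacuum_value(lam, mu) + h) ** 2   # phi^dag phi
    return 0.25 * lam * rho**2 - mu**2 * rho

def first_derivative(f, x, step=1e-4):
    """Central-difference first derivative."""
    return (f(x + step) - f(x - step)) / (2.0 * step)

def second_derivative(f, x, step=1e-4):
    """Central-difference second derivative."""
    return (f(x + step) - 2.0 * f(x) + f(x - step)) / step**2

lam, mu, q = 0.6, 1.3, 0.4   # illustrative parameter values

def v_of_h(h):
    return potential(h, lam, mu)

slope_at_minimum = first_derivative(v_of_h, 0.0)          # should vanish at h = 0
scalar_mass = math.sqrt(second_derivative(v_of_h, 0.0))   # mass^2 = V''(0)
vector_mass = q * vacuum_value(lam, mu)                   # M = v q from the A_mu A^mu term
```

Here `scalar_mass` reproduces $\sqrt{2}\mu$, since a term $-\mu^2 h^2$ in the Lagrangian corresponds to $\frac{1}{2}m^2 h^2$ in the potential with $m^2 = 2\mu^2$, while `vector_mass` is simply $vq$.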

We may now ask: what happens if we start with a certain phase $\hat\theta$ for $\hat\phi$
but do not make use of the gauge freedom in $\hat A^\nu$ to reduce $\hat\theta$ to zero? We shall
see in section 19.5 that the equation of motion, and hence the propagator,
for the vector particle depends on the choice of gauge; furthermore, Feynman
graphs involving quanta corresponding to the degree of freedom associated
with the phase field $\hat\theta$ will have to be included for a consistent theory, even
though this must be an unphysical degree of freedom, as follows from the fact
that a gauge can be chosen in which this field vanishes. That the propagator
is gauge dependent should, on reflection, come as a relief. After all, if the
massive vector boson generated in this way were simply described by the
wave equation (19.50), all the troubles with massive vector particles outlined
in section 19.1 would be completely unresolved. As we shall see, a different
choice of gauge from that which renders $\hat\phi$ real has precisely the effect of
ameliorating the bad high-energy behaviour associated with (19.50). This is
ultimately the reason for the following wonderful fact: massive vector theories,
in which the vector particles acquire mass through the spontaneous symmetry
breaking mechanism, are renormalizable (’t Hooft 1971b).

However, before discussing other gauges than the one in which $\hat\phi$ is given
by (19.55), we first explore another interesting aspect of superconductivity.

19.4 Flux quantization in a superconductor


Though a slight diversion, it is convenient to include a discussion of flux quan- 
tization at this point, while we have a number of relevant results assembled. 
Apart from its intrinsic interest, the phenomenon may also provide a useful 
physical model for the ‘confining’ property of QCD, as already discussed in 
sections 1.3.6 and 16.5.3. 

Our discussion of superconductivity so far has dealt, in fact, with only 
one class of superconductors, called type I; these remain superconducting 
throughout the bulk of the material (exhibiting a complete Meissner effect), 
when an external magnetic field of less than a certain critical value is applied. 
There is a quite separate class — type II superconductors — which allow partial 
entry of the external field, in the form of thin filaments of flux. Within each 
filament the field is high, and the material is not superconducting. Outside the 
core of the filaments, the material is superconducting and the field dies off over 
the characteristic penetration length $\lambda$. Around each filament of magnetic flux
there circulates a vortex of screening current; the filaments are often called 
vortex lines. It is as if numerous thin cylinders, each enclosing flux, had been 
drilled in a block of type I material, thereby producing a non-simply connected 
geometry. 

In real superconductors, screening currents are associated with the macroscopic
pair wavefunction (field) $\psi$. For type II behaviour to be possible, $|\psi|$
must vanish at the centre of a flux filament, and rise to the constant value
appropriate to the superconducting state over a distance $\xi < \lambda$, where $\xi$ is the
‘coherence length’ of section 17.7. According to the Ginzburg-Landau (GL)
theory, a more precise criterion is that type II behaviour holds if $\xi < 2^{1/2}\lambda$;
both $\xi$ and $\lambda$ are, of course, temperature-dependent. The behaviour of $|\psi|$ and
$B$ in the vicinity of a flux filament is shown in figure 19.2. Thus, whereas for
simple type I superconductivity, $|\psi|$ is simply set equal to a constant, in the
type II case $|\psi|$ has the variation shown in this figure. Solutions of the coupled
GL equations for $\boldsymbol A$ and $\psi$ can be obtained which exhibit this behaviour.

FIGURE 19.2
Magnetic field $B$ and modulus of the macroscopic (pair) wavefunction $|\psi|$ in
the neighbourhood of a flux filament.

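The GL criterion quoted above is easy to apply once the two lengths are known. The following is a minimal sketch, assuming only the inequality $\xi < 2^{1/2}\lambda$ from the text; the function name and the sample lengths are hypothetical and are not data for any real material.

```python
import math

def superconductor_type(xi, lam):
    """Classify by the GL criterion quoted in the text: type II if xi < sqrt(2) * lam.

    xi  : coherence length
    lam : penetration length (same units as xi)
    """
    return "II" if xi < math.sqrt(2.0) * lam else "I"

# Hypothetical illustrative lengths (say, in nanometres); not measured values.
print(superconductor_type(xi=100.0, lam=20.0))  # coherence length >> penetration length
print(superconductor_type(xi=5.0, lam=50.0))    # coherence length << penetration length
```

Because both lengths are temperature-dependent, such a classification would in practice have to be made at a stated temperature.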
An important result is that the flux through a vortex line is quantized. To 
see this, we write 
$$\psi = e^{i\phi}\,|\psi| \tag{19.57}$$


as before. The expression for the electromagnetic current is

$$\boldsymbol j_{\rm em} = -\frac{q^2}{m}\left(\boldsymbol A - \frac{\boldsymbol\nabla\phi}{q}\right)|\psi|^2 \tag{19.58}$$


as in (19.26), but in (19.58) we are leaving the charge parameter q undeter- 
mined for the moment; the mass parameter m will be unimportant. Rear- 
ranging, we have 

$$\boldsymbol A = -\frac{m}{q^2|\psi|^2}\,\boldsymbol j_{\rm em} + \frac{\boldsymbol\nabla\phi}{q}. \tag{19.59}$$

Let us integrate equation (19.59) around any closed loop $C$ in the type II
superconductor, which encloses a flux (or vortex) line. Far enough away from
the vortex, the screening currents $\boldsymbol j_{\rm em}$ will have dropped to zero, and hence

$$\oint_C \boldsymbol A\cdot{\rm d}\boldsymbol s = \frac{1}{q}\oint_C \boldsymbol\nabla\phi\cdot{\rm d}\boldsymbol s = \frac{1}{q}[\phi]_C \tag{19.60}$$

where $[\phi]_C$ is the change in phase around $C$. If the wavefunction $\psi$ is
single-valued, the change in phase $[\phi]_C$ for any closed path can only be zero or an
integer multiple of $2\pi$. Transforming the left-hand side of (19.60) by Stokes'
theorem, we obtain the result that the flux $\Phi$ through any surface spanning
$C$ is quantized:

$$\Phi = \int \boldsymbol B\cdot{\rm d}\boldsymbol S = \frac{2\pi n}{q} = n\Phi_0, \tag{19.61}$$

where $\Phi_0 = 2\pi/q$ is the flux quantum (or $2\pi\hbar/q$ in ordinary units). It is not
entirely self-evident why $\psi$ should be single-valued, but experiments do indeed
demonstrate the phenomenon of flux quantization, in units of $\Phi_0$ with $|q| = 2e$
(which may be interpreted as the charge on a Cooper pair, as usual). The
phenomenon is seen in non-simply connected specimens of type I superconductors
(i.e. ones with holes in them, such as a ring), and in the flux filaments of type
II materials; in the latter case each filament carries a single flux quantum $\Phi_0$.
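To get a feeling for the size of the flux quantum, one can evaluate $2\pi\hbar/|q| = h/|q|$ numerically. The sketch below assumes SI units (so that the flux comes out in webers) and uses the exact SI values of $h$ and $e$; the function name is our own choice for illustration.

```python
# Exact SI values (fixed by the 2019 redefinition of the SI).
PLANCK_H = 6.62607015e-34       # J s
ELEMENTARY_E = 1.602176634e-19  # C

def flux_quantum(q):
    """Flux quantum 2*pi*hbar/|q| = h/|q| in SI units (webers)."""
    return PLANCK_H / abs(q)

phi0_pair = flux_quantum(2 * ELEMENTARY_E)  # |q| = 2e, the Cooper-pair value
phi0_single = flux_quantum(ELEMENTARY_E)    # |q| = e, shown only for comparison

print(f"Phi_0 for |q| = 2e: {phi0_pair:.6e} Wb")
print(f"Phi_0 for |q| = e : {phi0_single:.6e} Wb")
```

The factor of two between the two values is what allows flux-quantization experiments to distinguish $|q| = 2e$ from $|q| = e$.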

It is interesting to consider now a situation (so far entirely hypothetical)
in which a magnetic monopole is placed in a superconductor. Dirac showed
(1931) that for consistency with quantum mechanics the monopole strength
$g_{\rm m}$ had to satisfy the ‘Dirac quantization condition’

$$q g_{\rm m} = n/2 \tag{19.62}$$

where $q$ is any electronic charge, and $n$ is an integer. It follows from (19.62)
that the flux $4\pi g_{\rm m}$ out of any closed surface surrounding the monopole is
quantized in units of $\Phi_0$. Hence a flux filament in a superconductor can
originate from, or be terminated by, a Dirac monopole (with the appropriate
sign of $g_{\rm m}$), as was first pointed out by Nambu (1974).
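The step from (19.62) to the statement about the flux can be spelled out in one line. The following short check assumes that the charge $q$ appearing in the Dirac condition is the same $q$ that defines $\Phi_0$ in (19.61); if the two differ by an integer factor, the flux is still an integer multiple of $\Phi_0$.

```latex
\begin{align*}
\Phi_{\rm monopole} &= 4\pi g_{\rm m}
   = 4\pi\,\frac{n}{2q}
   && \text{using } q g_{\rm m} = n/2 \text{ from (19.62)} \\
 &= n\,\frac{2\pi}{q} = n\,\Phi_0
   && \text{with } \Phi_0 = 2\pi/q \text{ from (19.61).}
\end{align*}
```

On this reading a monopole of minimal strength ($n = 1$) sources exactly one flux quantum, which a single vortex line can carry away.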

This is the basic model which, in one way or another, underlies many 
theoretical attempts to understand confinement. The monopole-antimonopole 
pair in a type II superconducting vacuum, joined by a quantized magnetic flux 
filament, provides a model of a meson. As the distance between the pair — 
the length of the filament — increases, so does the energy of the filament, at a 
rate proportional to its length, since the flux cannot spread out in directions 
transverse to the filament. This is exactly the kind of linearly rising potential 
energy required by hadron spectroscopy (see equations (1.33) and (16.145)). 
The configuration is stable, because there is no way for the flux to leak away; 
it is a conserved quantized quantity. 

For the eventual application to QCD, one will want (presumably) par- 
ticles carrying non-zero values of the colour quantum numbers to be con- 
fined. These quantum numbers are the analogues of electric charge in the 
U(1) case, rather than of magnetic charge. We imagine, therefore, interchang- 
ing the roles of magnetism and electricity in all of the foregoing. Indeed, the 
Maxwell equations have such a symmetry when monopoles are present, as well 
as charges. The essential feature of the superconducting ground state was that 
it involved the coherent state formed by condensation of electrically charged 
bosonic fermion pairs. A vacuum which confined filaments of E rather than B 
may be formed as a coherent state of condensed magnetic monopoles (Man- 
delstam 1976, 't Hooft 1976). These E filaments would then terminate on 
electric charges. Now magnetic monopoles do not occur naturally as solutions 
of QED: they would have to be introduced by hand. Remarkably enough,
however, solutions of the magnetic monopole type do occur in the case of
non-Abelian gauge field theories, whose symmetry is spontaneously broken
to an electromagnetic U(1)$_{\rm em}$ gauge group. Just this circumstance can arise
in a grand unified theory which contains SU(3)$_{\rm c}$ and a residual U(1)$_{\rm em}$.
Incidentally, these monopole solutions provide an illuminating way of thinking
about charge quantization: as Dirac (1931) pointed out, the existence of just 
one monopole implies, from his quantization condition (19.62), that charge is 
quantized. 

When these ideas are applied to QCD, E and B must be understood as 
the appropriate colour fields (i.e. they carry an SU(3)$_{\rm c}$ index). The group
structure of SU(3) is also quite different from that of U(1) models, and we do 
not want to be restricted just to static solutions (as in the GL theory, here 
used as an analogue). Whether in fact the real QCD vacuum (ground state) 
is formed as some such coherent plasma of monopoles, with confinement of 
electric charges and flux, is a subject of continuing research; other schemes are 
also possible. As so often stressed, the difficulty lies in the non-perturbative 
nature of the confinement problem. 




19.5 ’t Hooft’s gauges


We must now at last grasp the nettle and consider what happens if, in the
parametrization

$$\hat\phi = |\hat\phi|\exp(i\hat\theta(x)/v) \tag{19.63}$$

we do not choose the gauge (cf (19.52))

$$\partial_\mu\hat A^\mu = \Box\hat\theta/M. \tag{19.64}$$

This was the gauge that enabled us to transform away the phase degree of
freedom and reduce the equation of motion for the electromagnetic field to
that of a massive vector boson. Instead of using the modulus and phase as
the two independent degrees of freedom for the complex Higgs field $\hat\phi$, we now
choose to parametrize $\hat\phi$, quite generally, by the decomposition

$$\hat\phi = 2^{-1/2}\,[v + \hat\chi_1(x) + i\hat\chi_2(x)], \tag{19.65}$$

where the vacuum values of $\hat\chi_1$ and $\hat\chi_2$ are zero. Substituting this form for $\hat\phi$
into the master equation for $\hat A^\nu$ (obtained from (19.43) and (19.44))

$$\Box\hat A^\nu - \partial^\nu(\partial_\mu\hat A^\mu) = iq[\hat\phi^\dagger\partial^\nu\hat\phi - (\partial^\nu\hat\phi)^\dagger\hat\phi] - 2q^2\hat A^\nu\hat\phi^\dagger\hat\phi, \tag{19.66}$$

leads to the equation of motion

$$\begin{aligned}
(\Box + M^2)\hat A^\nu - \partial^\nu(\partial_\mu\hat A^\mu) = {} & -M\partial^\nu\hat\chi_2 + q(\hat\chi_2\partial^\nu\hat\chi_1 - \hat\chi_1\partial^\nu\hat\chi_2) \\
& - q^2\hat A^\nu(\hat\chi_1^2 + 2v\hat\chi_1 + \hat\chi_2^2)
\end{aligned} \tag{19.67}$$

with $M = qv$. At first sight this just looks like the equation of motion of
an ordinary massive vector field $\hat A^\nu$ coupled to a rather complicated current.
However, this certainly cannot be right, as we can see by a count of the degrees
of freedom. In the previous gauge we had four degrees of freedom, counted
either as two for the original massless $\hat A^\nu$ plus one each for $\hat\theta$ and $\hat h$, or as three
for the massive $\hat A'^\nu$ and one for $\hat h$. If we take this new equation at face value,
there seem to be three degrees of freedom for the massive field $\hat A^\nu$, and one for
each of $\hat\chi_1$ and $\hat\chi_2$, making five in all. Actually, we know perfectly well that
we can make use of the freedom of gauge choice to set $\hat\chi_2$ to zero, say, reducing
$\hat\phi$ to a real quantity and eliminating a spurious degree of freedom: we have
then returned to the form (19.55). In terms of (19.67), the consequence of the
unwanted degree of freedom is quite subtle, but it is basic to all gauge theories
and already appeared in the photon case, in section 7.3.2. The difficulty arises
when we try to calculate the propagator for $\hat A^\nu$ from equation (19.67).

FIGURE 19.3
$\hat A^\nu$–$\hat\chi_2$ coupling.

The operator on the left-hand side can be simply inverted, as was done in
section 19.1, to yield (apparently) the standard massive vector boson
propagator

$$i(-g^{\mu\nu} + k^\mu k^\nu/M^2)/(k^2 - M^2). \tag{19.68}$$

However, the current on the right-hand side of (19.67) is rather peculiar:
instead of having only terms corresponding to $\hat A^\nu$ coupling to two or three
particles, there is also a term involving only one field. This is the term $-M\partial^\nu\hat\chi_2$,
which tells us that $\hat A^\nu$ actually couples directly to the scalar field $\hat\chi_2$ via
the gradient coupling ($-M\partial^\nu$). In momentum space this corresponds to a
coupling strength $-ik^\nu M$ and an associated vertex as shown in figure 19.3.
Clearly, for a scalar particle, the momentum 4-vector is the only quantity
that can couple to the vector index of the vector boson. The existence of
this coupling shows that the propagators of $\hat A^\nu$ and $\hat\chi_2$ are necessarily mixed:
the complete vector propagator must be calculated by summing the infinite
series shown diagrammatically in figure 19.4. This complication is, of course,
completely eliminated by the gauge choice $\hat\chi_2 = 0$. However, we are interested
in pursuing the case $\hat\chi_2 \neq 0$.

In figure 19.4 the only unknown factor is the propagator for $\hat\chi_2$. This can
be easily found by substituting (19.65) into $\hat{\mathcal L}_{\rm H}$ and examining the part which
is quadratic in the $\hat\chi$'s; we find (problem 19.6)

$$\hat{\mathcal L}_{\rm H} = \tfrac12\partial_\mu\hat\chi_1\partial^\mu\hat\chi_1 + \tfrac12\partial_\mu\hat\chi_2\partial^\mu\hat\chi_2 - \mu^2\hat\chi_1^2 + \text{cubic and quartic terms}. \tag{19.69}$$

FIGURE 19.4
Series for the full $\hat A^\nu$ propagator.

FIGURE 19.5
Formal summation of the series in figure 19.4.

Equation (19.69) confirms that $\hat\chi_1$ is a massive field with mass $\sqrt{2}\mu$ (like the
$\hat h$ in (19.56)), while $\hat\chi_2$ is massless. The $\hat\chi_2$ propagator is therefore $i/k^2$. Now
that all the elements of the diagrams are known, we can formally sum the
series by generalizing the well known result (cf (10.12) and (11.27))

$$(1 - x)^{-1} = 1 + x + x^2 + x^3 + \cdots. \tag{19.70}$$


Diagrammatically, we rewrite the propagator of figure 19.4 as in figure 19.5 and
perform the sum. Inserting the expressions for the propagators and
vector-scalar coupling, and keeping track of the indices, we finally arrive at the result
(problem 19.7)

$$i\left(\frac{-g^{\mu\lambda} + k^\mu k^\lambda/M^2}{k^2 - M^2}\right)\left(g^\nu{}_\lambda - \frac{k^\nu k_\lambda}{k^2}\right)^{-1} \tag{19.71}$$

for the full propagator. But the inverse required in (19.71) is precisely (with a
lowered index) the one we needed for the photon propagator in (7.91), and,
as we saw there, it does not exist. At last the fact that we are dealing with a
gauge theory has caught up with us!
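To see why such an inverse fails to exist, it is enough to act with the tensor on the momentum itself. The short check below assumes that the tensor to be inverted has the transverse form $g^\nu{}_\lambda - k^\nu k_\lambda/k^2$, i.e. the same structure (up to an overall factor and an index position) as the photon operator of section 7.3.2.

```latex
\begin{align*}
\left(g^{\nu}{}_{\lambda} - \frac{k^{\nu}k_{\lambda}}{k^{2}}\right)k^{\lambda}
  = k^{\nu} - \frac{k^{\nu}\,k^{2}}{k^{2}} = 0 .
\end{align*}
```

A tensor with a zero eigenvalue (here with eigenvector $k^\lambda$) has vanishing determinant, and so it cannot be inverted.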

As we saw in section 7.3.2, to obtain a well-defined gauge field propagator
we need to fix the gauge. A clever way to do this in the present (spontaneously
broken) case was suggested by ’t Hooft (1971b). His proposal was to set

$$\partial_\mu\hat A^\mu = M\xi\hat\chi_2 \tag{19.72}$$

where $\xi$ is an arbitrary gauge parameter¹ (not to be confused with the
superconducting coherence length). This condition is manifestly covariant, and
moreover it effectively reduces the degrees of freedom by one. Inserting (19.72)
into (19.67) we obtain

$$\begin{aligned}
(\Box + M^2)\hat A^\nu - \partial^\nu(\partial_\mu\hat A^\mu)(1 - 1/\xi) = {} & q(\hat\chi_2\partial^\nu\hat\chi_1 - \hat\chi_1\partial^\nu\hat\chi_2) \\
& - q^2\hat A^\nu(\hat\chi_1^2 + 2v\hat\chi_1 + \hat\chi_2^2).
\end{aligned} \tag{19.73}$$

¹ We shall not enter here into the full details of quantization in such a gauge: we shall
effectively treat (19.72) as a classical field relation.


The operator appearing on the left-hand side now does have an inverse (see
problem 19.8) and yields the general form for the gauge boson propagator

$$i\left[-g^{\mu\nu} + \frac{(1 - \xi)k^\mu k^\nu}{k^2 - \xi M^2}\right](k^2 - M^2)^{-1}. \tag{19.74}$$


This propagator is very remarkable². The standard massive vector boson
propagator

$$i(-g^{\mu\nu} + k^\mu k^\nu/M^2)(k^2 - M^2)^{-1} \tag{19.75}$$

is seen to correspond to the limit $\xi \to \infty$, and in this gauge the high-energy
disease outlined in section 19.1 appears to threaten renormalizability (in fact,
it can be shown that there is a consistent set of Feynman rules for this gauge,
and the theory is renormalizable thanks to many cancellations of divergences).
For any finite $\xi$, however, the high-energy behaviour of the gauge boson
propagator is actually $\sim 1/k^2$, which is as good as the renormalizable theory of
QED (in Lorentz gauge). Note, however, that there seems to be another pole
in the propagator (19.74) at $k^2 = \xi M^2$: this is surely unphysical since it
depends on the arbitrary parameter $\xi$. A full treatment (’t Hooft 1971b) shows
that this pole is always cancelled by an exactly similar pole in the propagator
for the $\hat\chi_2$ field itself. These finite-$\xi$ gauges are called R gauges (since they
are ‘manifestly renormalizable’) and typically involve unphysical Higgs fields
such as $\hat\chi_2$. The infinite-$\xi$ gauge is known as the U gauge (U for unitary) since
only physical particles appear in this gauge. For tree diagram calculations, of
course, it is easiest to use the U gauge Feynman rules: the technical difficulties
with this gauge choice only enter in loop calculations, for which the R gauge
choice is easier.
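The statement that the $\xi$-dependent propagator inverts the operator on the left-hand side of the gauge-fixed equation of motion can be checked numerically. The sketch below assumes the momentum-space operator $(-k^2 + M^2)g^{\nu\mu} + (1 - 1/\xi)k^\nu k^\mu$ and the propagator $[-g_{\mu\rho} + (1-\xi)k_\mu k_\rho/(k^2 - \xi M^2)]/(k^2 - M^2)$ (the factor $i$ is dropped), with metric ${\rm diag}(1,-1,-1,-1)$; the momentum, mass and $\xi$ values are arbitrary test numbers, and the function names are ours.

```python
# Metric g = diag(1, -1, -1, -1); numerically identical with upper or lower indices.
G = [[1.0 if m == n == 0 else (-1.0 if m == n else 0.0) for n in range(4)]
     for m in range(4)]

def lower(k_up):
    """Lower the index of a 4-vector: k_mu = g_{mu nu} k^nu."""
    return [G[m][m] * k_up[m] for m in range(4)]

def operator_upper(k_up, M, xi):
    """O^{nu mu} = (-k^2 + M^2) g^{nu mu} + (1 - 1/xi) k^nu k^mu."""
    k2 = sum(a * b for a, b in zip(k_up, lower(k_up)))
    return [[(-k2 + M**2) * G[n][m] + (1.0 - 1.0 / xi) * k_up[n] * k_up[m]
             for m in range(4)] for n in range(4)]

def propagator_lower(k_up, M, xi):
    """Candidate inverse P_{mu rho}: the xi-gauge propagator without the factor i."""
    k_dn = lower(k_up)
    k2 = sum(a * b for a, b in zip(k_up, k_dn))
    return [[(-G[m][r] + (1.0 - xi) * k_dn[m] * k_dn[r] / (k2 - xi * M**2))
             / (k2 - M**2) for r in range(4)] for m in range(4)]

def product(k_up, M, xi):
    """Contract O^{nu mu} P_{mu rho}; this should be the identity delta^nu_rho."""
    O = operator_upper(k_up, M, xi)
    P = propagator_lower(k_up, M, xi)
    return [[sum(O[n][m] * P[m][r] for m in range(4)) for r in range(4)]
            for n in range(4)]

k = [3.0, 0.5, -1.2, 2.0]  # arbitrary off-shell momentum (hypothetical test values)
for xi in (0.5, 1.0, 3.0):
    result = product(k, M=1.5, xi=xi)
    worst = max(abs(result[n][r] - (1.0 if n == r else 0.0))
                for n in range(4) for r in range(4))
    print(f"xi = {xi}: max deviation from identity = {worst:.2e}")
```

A numerical check of this kind is no substitute for the algebra of problem 19.8, but it is a quick way to catch sign or index errors in a proposed propagator.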

Notice that in our master formula (19.74) for the gauge boson propagator 
the limit M — 0 may be safely taken (compare the remarks about this limit 
for the ‘naive’ massive vector boson propagator in section 19.1). This yields 
the massless vector boson (photon) propagator in a general $\xi$-gauge, exactly
as in equation (7.122) or (19.23). 
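The contrast between the two $M \to 0$ limits can be displayed explicitly. The following sketch takes the general propagator in the form $i[-g^{\mu\nu} + (1-\xi)k^\mu k^\nu/(k^2 - \xi M^2)]/(k^2 - M^2)$, which is our reading of (19.74), and compares it with the ‘naive’ massive form.

```latex
\begin{align*}
\lim_{M\to 0}\; i\,
\frac{-g^{\mu\nu} + \dfrac{(1-\xi)\,k^{\mu}k^{\nu}}{k^{2}-\xi M^{2}}}{k^{2}-M^{2}}
 &= i\,\frac{-g^{\mu\nu} + (1-\xi)\,k^{\mu}k^{\nu}/k^{2}}{k^{2}} , \\[4pt]
\text{whereas}\qquad
i\,\frac{-g^{\mu\nu} + k^{\mu}k^{\nu}/M^{2}}{k^{2}-M^{2}}
 &\quad \text{has no finite limit as } M \to 0,
 \text{ because of the } k^{\mu}k^{\nu}/M^{2} \text{ term.}
\end{align*}
```

For $\xi = 1$ the first line reduces to the familiar form $-ig^{\mu\nu}/k^2$.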

We now proceed with the generalization of these ideas to the non-Abelian 
SU(2) case, which is the one relevant to the electroweak theory. The general 
non-Abelian case was treated by Kibble (1967). 


² A vector boson propagator of similar form was first introduced by Lee and Yang (1962),
but their discussion was not within the framework of a spontaneously broken theory, so that
Higgs particles were not present, and the physical limit was obtained only as $\xi \to 0$.

19.6 Spontaneously broken local SU(2)×U(1) symmetry


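We shall limit our discussion of the spontaneous breaking of a local
non-Abelian symmetry to the particular case needed for the electroweak part of the
Standard Model. This is, in fact, just the local version of the model studied
in section 17.6. As noted there, the Lagrangian $\hat{\mathcal L}_\Phi$ of (17.97) is invariant
under global SU(2) transformations of the form (17.100), and also global U(1)
transformations (17.101). Thus in the local version we shall have to introduce
three SU(2) gauge fields (as in section 13.1), which we call $\hat W_i^\mu(x)$ ($i = 1, 2, 3$),
and one U(1) gauge field $\hat B^\mu(x)$. We recall that the scalar field $\hat\phi$ is an
SU(2)-doublet

$$\hat\phi = \begin{pmatrix}\hat\phi^+ \\ \hat\phi^0\end{pmatrix}, \tag{19.76}$$

so that the SU(2) covariant derivative acting on $\hat\phi$ is as given in (13.10), namely

$$\hat D^\mu = \partial^\mu + ig\boldsymbol\tau\cdot\hat{\boldsymbol W}^\mu/2. \tag{19.77}$$

To this must be added the U(1) piece, which we write as $ig'\hat B^\mu/2$, the $\tfrac12$ being
for later convenience. The Lagrangian (without gauge-fixing and ghost terms)
is therefore

$$\hat{\mathcal L}_{{\rm G}\Phi} = (\hat D_\mu\hat\phi)^\dagger(\hat D^\mu\hat\phi) + \mu^2\hat\phi^\dagger\hat\phi - \frac{\lambda}{4}(\hat\phi^\dagger\hat\phi)^2 - \tfrac14\hat{\boldsymbol F}_{\mu\nu}\cdot\hat{\boldsymbol F}^{\mu\nu} - \tfrac14\hat G_{\mu\nu}\hat G^{\mu\nu} \tag{19.78}$$

where

$$\hat D^\mu\hat\phi = (\partial^\mu + ig\boldsymbol\tau\cdot\hat{\boldsymbol W}^\mu/2 + ig'\hat B^\mu/2)\hat\phi, \tag{19.79}$$

$$\hat{\boldsymbol F}^{\mu\nu} = \partial^\mu\hat{\boldsymbol W}^\nu - \partial^\nu\hat{\boldsymbol W}^\mu - g\hat{\boldsymbol W}^\mu\times\hat{\boldsymbol W}^\nu, \tag{19.80}$$

and

$$\hat G^{\mu\nu} = \partial^\mu\hat B^\nu - \partial^\nu\hat B^\mu. \tag{19.81}$$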


We must now decide how to choose the non-zero vacuum expectation value
that breaks this symmetry. The essential point for the electroweak
application is that, after symmetry breaking, we should be left with three massive
gauge bosons (which will be the W$^\pm$ and Z$^0$) and one massless gauge
boson, the photon. We may reasonably guess that the massless boson will
be associated with a symmetry that is unbroken by the vacuum expectation
value. Put differently, we certainly do not want a ‘superconducting’ massive
photon to emerge from the theory in this case, as the physical vacuum is not
an electromagnetic superconductor. This means that we do not want to give a
vacuum value to a charged field (as is done in the BCS ground state). On the
other hand, we do want it to behave as a ‘weak’ superconductor, generating
mass for W$^\pm$ and Z$^0$. The choice suggested by Weinberg (1967) was

$$\langle 0|\hat\phi|0\rangle = \begin{pmatrix}0 \\ v/\sqrt{2}\end{pmatrix} \tag{19.82}$$

where $v/\sqrt{2} = \sqrt{2}\mu/\lambda^{1/2}$, which we already considered in the global case in
section 17.6. As pointed out there, (19.82) implies that the vacuum remains
invariant under the combined transformation of ‘U(1) + third component of
SU(2) isospin’; that is, (19.82) implies

$$(\tfrac12 + t_3^{(1/2)})\langle 0|\hat\phi|0\rangle = 0 \tag{19.83}$$

and hence

$$\langle 0|\hat\phi|0\rangle \to \langle 0|\hat\phi|0\rangle' = \exp[i\alpha(\tfrac12 + t_3^{(1/2)})]\langle 0|\hat\phi|0\rangle = \langle 0|\hat\phi|0\rangle, \tag{19.84}$$

where as usual $t_3^{(1/2)} = \tau_3/2$ (we are using lowercase $t$ for isospin now,
anticipating that it is the weak, rather than hadronic, isospin; see chapter 21).
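We now need to consider oscillations about (19.82) in order to see the
physical particle spectrum. As in (17.107) we parametrize these conveniently
as

$$\hat\phi = \exp(-i\hat{\boldsymbol\theta}(x)\cdot\boldsymbol\tau/2v)\begin{pmatrix}0 \\ \frac{1}{\sqrt{2}}(v + \hat H(x))\end{pmatrix} \tag{19.85}$$

(compare (19.45)). However this time, in contrast to (17.107) but just as in
(19.55), we can reduce the phase fields $\hat{\boldsymbol\theta}$ to zero by an appropriate gauge
transformation, and it is simplest to examine the particle spectrum in this
(unitary) gauge. Substituting

$$\hat\phi = \begin{pmatrix}0 \\ \frac{1}{\sqrt{2}}(v + \hat H(x))\end{pmatrix} \tag{19.86}$$

into (19.78) and retaining only terms which are second order in the fields (i.e.
kinetic energies or mass terms) we find (problem 19.9)

$$\begin{aligned}
\hat{\mathcal L}^{\rm free}_{{\rm G}\Phi} = {} & \tfrac12\partial_\mu\hat H\partial^\mu\hat H - \mu^2\hat H^2 \\
& - \tfrac14(\partial_\mu\hat W_{1\nu} - \partial_\nu\hat W_{1\mu})(\partial^\mu\hat W_1^\nu - \partial^\nu\hat W_1^\mu) + \tfrac18 g^2v^2\hat W_{1\mu}\hat W_1^\mu \\
& - \tfrac14(\partial_\mu\hat W_{2\nu} - \partial_\nu\hat W_{2\mu})(\partial^\mu\hat W_2^\nu - \partial^\nu\hat W_2^\mu) + \tfrac18 g^2v^2\hat W_{2\mu}\hat W_2^\mu \\
& - \tfrac14(\partial_\mu\hat W_{3\nu} - \partial_\nu\hat W_{3\mu})(\partial^\mu\hat W_3^\nu - \partial^\nu\hat W_3^\mu) - \tfrac14\hat G_{\mu\nu}\hat G^{\mu\nu} \\
& + \tfrac18 v^2(g\hat W_{3\mu} - g'\hat B_\mu)(g\hat W_3^\mu - g'\hat B^\mu).
\end{aligned} \tag{19.87}$$

The first line of (19.87) tells us that we have a scalar field of mass $\sqrt{2}\mu$ (the
Higgs boson, again). The next two lines tell us that the components $\hat W_1$ and
$\hat W_2$ of the triplet $(\hat W_1, \hat W_2, \hat W_3)$ acquire a mass (cf (19.56) in the U(1) case)

$$M_1 = M_2 = gv/2 \equiv M_{\rm W}. \tag{19.88}$$

The last two lines show us that the fields $\hat W_3$ and $\hat B$ are mixed. But they
can easily be unmixed by noting that the last term in (19.87) involves only
the combination $g\hat W_3^\mu - g'\hat B^\mu$, which evidently acquires a mass. This suggests
introducing the normalized linear combination

$$\hat Z^\mu = \cos\theta_{\rm W}\hat W_3^\mu - \sin\theta_{\rm W}\hat B^\mu \tag{19.89}$$

where

$$\cos\theta_{\rm W} = g/(g^2 + g'^2)^{1/2}, \qquad \sin\theta_{\rm W} = g'/(g^2 + g'^2)^{1/2}, \tag{19.90}$$

together with the orthogonal combination

$$\hat A^\mu = \sin\theta_{\rm W}\hat W_3^\mu + \cos\theta_{\rm W}\hat B^\mu. \tag{19.91}$$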


We then find that the last two lines of (19.87) become

$$-\tfrac14(\partial_\mu\hat Z_\nu - \partial_\nu\hat Z_\mu)(\partial^\mu\hat Z^\nu - \partial^\nu\hat Z^\mu) + \tfrac18 v^2(g^2 + g'^2)\hat Z_\mu\hat Z^\mu - \tfrac14\hat F_{\mu\nu}\hat F^{\mu\nu}, \tag{19.92}$$

where

$$\hat F_{\mu\nu} = \partial_\mu\hat A_\nu - \partial_\nu\hat A_\mu. \tag{19.93}$$

Thus

$$M_{\rm Z} = \tfrac12 v(g^2 + g'^2)^{1/2} = M_{\rm W}/\cos\theta_{\rm W} \tag{19.94}$$

and

$$M_{\rm A} = 0. \tag{19.95}$$

Counting degrees of freedom as in the local U(1) case, we originally had 12 in
(19.78): three massless $\hat W$'s and one massless $\hat B$, which is 8 degrees of freedom
in all, together with 4 $\hat\phi$-fields, all with the same mass. After symmetry
breaking we have three massive vector fields $\hat W_1$, $\hat W_2$ and $\hat Z$ with 9 degrees
of freedom, one massless vector field $\hat A$ with 2, and one massive scalar $\hat H$.
Of course, the physical application will be to identify the $\hat W$ and $\hat Z$ fields
with those physical particles, the $\hat A$ field with the massless photon, and the $\hat H$
field with the Higgs boson. In the gauge (19.86), the W and Z particles have
propagators of the form (19.22).
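The bookkeeping in this count can be laid out as a one-line sum on each side. This is simply a restatement of the numbers given in the text, using 2 polarization states for a massless vector field and 3 for a massive one.

```latex
\begin{align*}
\text{before breaking:}\quad
 & \underbrace{3\times 2}_{\text{massless } W_{1,2,3}}
 + \underbrace{1\times 2}_{\text{massless } B}
 + \underbrace{4}_{\text{real scalar fields in } \phi} = 12, \\
\text{after breaking:}\quad
 & \underbrace{3\times 3}_{\text{massive } W_1,\,W_2,\,Z}
 + \underbrace{1\times 2}_{\text{massless } A}
 + \underbrace{1}_{H} = 12 .
\end{align*}
```

Three of the four scalar degrees of freedom have become the longitudinal polarization states of the massive vector fields.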

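The identification of $\hat A^\mu$ with the photon field is made clearer if we look
at the form of $\hat D_\mu\hat\phi$ written in terms of $\hat A_\mu$ and $\hat Z_\mu$, discarding the $\hat W_1$, $\hat W_2$
pieces:

$$\hat D_\mu\hat\phi = \left\{\partial_\mu + ig\sin\theta_{\rm W}\left(\frac{1 + \tau_3}{2}\right)\hat A_\mu + \frac{ig}{\cos\theta_{\rm W}}\left[\frac{\tau_3}{2} - \sin^2\theta_{\rm W}\left(\frac{1 + \tau_3}{2}\right)\right]\hat Z_\mu\right\}\hat\phi. \tag{19.96}$$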

Now the operator $(1 + \tau_3)$ acting on $\langle 0|\hat\phi|0\rangle$ gives zero, as observed in (19.83),
and this is why $\hat A_\mu$ does not acquire a mass when $\langle 0|\hat\phi|0\rangle \neq 0$ (gauge fields
coupled to unbroken symmetries of $\langle 0|\hat\phi|0\rangle$ do not become massive). Although
certainly not unique, this choice of $\hat\phi$ and $\langle 0|\hat\phi|0\rangle$ is undoubtedly very
economical and natural. We are interpreting the zero eigenvalue of $(1 + \tau_3)$ as the
electromagnetic charge of the vacuum, which we do not wish to be non-zero.
We then make the identification

$$e = g\sin\theta_{\rm W} \tag{19.97}$$

in order to get the right ‘electromagnetic $\hat D_\mu$’ in (19.96).
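The tree-level relations of this section can be tried out numerically. The sketch below assumes $M_{\rm W} = gv/2$ (the W mass implied by the $\hat W_{1,2}$ mass terms), $M_{\rm Z} = \tfrac12 v(g^2 + g'^2)^{1/2}$, the mixing-angle definitions above and $e = g\sin\theta_{\rm W}$; the input values of $g$, $g'$ and $v$ are round illustrative numbers, not fitted parameters, and the function names are our own.

```python
import math

def electroweak_relations(g, g_prime, v):
    """Tree-level masses, mixing angle and e from g, g' and the vev v."""
    norm = math.hypot(g, g_prime)             # (g^2 + g'^2)^(1/2)
    cos_w, sin_w = g / norm, g_prime / norm   # mixing-angle definitions
    return {
        "cos_w": cos_w,
        "sin_w": sin_w,
        "M_W": 0.5 * g * v,       # assumed: W mass read off from the W_1, W_2 mass terms
        "M_Z": 0.5 * v * norm,    # Z mass
        "e": g * sin_w,           # electromagnetic coupling
    }

def neutral_mass_matrix(g, g_prime, v):
    """Mass-squared matrix in the (W3, B) basis from (v^2/8)(g W3 - g' B)^2."""
    f = 0.25 * v * v
    return [[f * g * g, -f * g * g_prime],
            [-f * g * g_prime, f * g_prime * g_prime]]

# Illustrative inputs only (assumed round numbers; v in GeV).
g, g_prime, v = 0.65, 0.35, 246.0
out = electroweak_relations(g, g_prime, v)
m2 = neutral_mass_matrix(g, g_prime, v)

# The photon direction (sin, cos) should be annihilated by the mass matrix;
# the Z direction (cos, -sin) should be an eigenvector with eigenvalue M_Z^2.
photon = [out["sin_w"], out["cos_w"]]
z_dir = [out["cos_w"], -out["sin_w"]]
m2_photon = [sum(m2[i][j] * photon[j] for j in range(2)) for i in range(2)]
m2_z = [sum(m2[i][j] * z_dir[j] for j in range(2)) for i in range(2)]

print(out)
print("M^2 acting on A-direction:", m2_photon)
print("M^2 acting on Z-direction, divided by M_Z^2:",
      [x / out["M_Z"] ** 2 for x in m2_z])
```

The vanishing of the first product is the numerical counterpart of the statement that the gauge field coupled to the unbroken symmetry stays massless.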

We emphasize once more that the particular form of (19.87) corresponds
to a choice of gauge, namely the unitary one (cf the discussions in sections
19.3 and 19.5). There is always the possibility of using other gauges, as in
the Abelian case, and this will in general be advantageous when doing loop
calculations involving renormalization. We would then return to a general
parametrization such as (cf (19.65) and (17.95))

$$\hat\phi = \begin{pmatrix}0 \\ v/\sqrt{2}\end{pmatrix} + \frac{1}{\sqrt{2}}\begin{pmatrix}\hat\phi_2 - i\hat\phi_1 \\ \hat\sigma - i\hat\phi_3\end{pmatrix}, \tag{19.98}$$

and add ’t Hooft gauge-fixing terms

$$-\frac{1}{2\xi}\left\{\sum_{i=1,2}(\partial_\mu\hat W_i^\mu + \xi M_{\rm W}\hat\phi_i)^2 + (\partial_\mu\hat Z^\mu + \xi M_{\rm Z}\hat\phi_3)^2 + (\partial_\mu\hat A^\mu)^2\right\}. \tag{19.99}$$

In this case the gauge boson propagators are all of the form (19.74), and
$\xi$-dependent. In such gauges, the Feynman rules will have to involve graphs
corresponding to exchange of quanta of the ‘unphysical’ fields $\hat\phi_i$, as well as
those of the physical Higgs scalar $\hat\sigma$. There will also have to be suitable ghost
interactions in the non-Abelian sector as discussed in section 13.3.3. The
complete Feynman rules of the electroweak theory are given in Appendix B
of Cheng and Li (1984), for example.

The model introduced here is actually the ‘Higgs sector’ of the Standard
Model, but without any couplings to fermions. We have seen how, by
supposing that the potential in (19.78) has the symmetry-breaking sign of the
parameter $\mu^2$, the W$^\pm$ and Z$^0$ gauge bosons can be given masses. This seems
to be an ingenious and even elegant ‘mechanism’ for arriving at a
renormalizable theory of massive vector bosons. One may of course wonder whether
this ‘mechanism’ is after all purely phenomenological, somewhat akin to the
GL theory of a superconductor. In the latter case, we know that it can be
derived from ‘microscopic’ BCS theory, and this naturally leads to the
question whether there could be a similar underlying ‘dynamical’ theory, behind
the Higgs sector. It is, in fact, quite simple to construct a theory in which the
Higgs fields $\hat\phi$ appear as bound, or composite, states of heavy fermions.

But generating masses for the gauge bosons is not the only job that the
Higgs sector does, in the Standard Model: it also generates masses for all
the fermions. As we will see in chapter 22, the gauge symmetry of the weak
interactions is a chiral one, which requires that there should be no explicit
fermion masses in the Lagrangian. We saw in chapter 18 how there is good 
evidence that the strong QCD interactions break chiral symmetry sponta- 
neously, but that there is also a need for small Lagrangian masses for the 
quarks, which break chiral symmetry explicitly (so as to give mass to the 
pions, for example). The leptons are of course not coupled to QCD, and 
we have to assume Lagrangian masses for them too. Thus for both quarks 
and leptons chiral-symmetry-breaking mass terms seem to be required. The 
only way to preserve the weak chiral gauge symmetry is to assume that these 
fermion masses must, in their turn, be interpreted as arising ‘spontaneously’ 
also; that is, not via an explicit mass term in the Lagrangian. The dynamical 
generation of quark and lepton masses would, in fact, be closely analogous 
to the generation of the energy gap in the BCS theory, as we saw in section 
18.1. So we may ask: is it possible to find a dynamical theory which gener- 
ates masses for both the vector bosons, and the fermions? Such theories are 
generically known as ‘technicolour models’ (Weinberg 1979b, Susskind 1979), 
and they have been intensively studied (see, for example, Peskin 1997). One 
problem is that such theories are already tightly constrained by the precision 
electroweak experiments (see chapter 22), and meeting these constraints seems 
to require rather elaborate kinds of models. However, technicolour theories do 
offer the prospect of a new strongly interacting sector, which could possibly 
be probed at the LHC. But such ideas take us beyond the scope of the present 
volume. Within the Standard Model, one proceeds along what seems a more 
phenomenological route, attributing the masses of fermions to their couplings 
with the Higgs field, in a way that will be explained in chapter 22: briefly, the
couplings have the Yukawa form $g_f\bar{\hat f}\hat f\hat\phi$, so that when $\hat\phi$ develops a vev $v$, the
fermions gain a mass $m_f = g_f v$.

We now turn, in the last part of the book, to weak interactions and the 
electroweak theory. 


Problems 


19.1 Show that

$$[(-k^2 + M^2)g^{\nu\mu} + k^\nu k^\mu]\left(\frac{-g_{\mu\rho} + k_\mu k_\rho/M^2}{k^2 - M^2}\right) = g^\nu{}_\rho.$$

19.2 Verify (19.18).

19.3 Verify (19.19).

19.4 Verify (19.46).

19.5 Insert (19.55) into $\hat{\mathcal L}_{\rm H}$ of (19.40) and derive (19.56) for the quadratic
terms.

19.6 Insert (19.65) into $\hat{\mathcal L}_{\rm H}$ of (19.40) and derive the quadratic terms of
(19.69).

19.7 Derive (19.71).

19.8 Write the left-hand side of (19.73) in momentum space (as in (19.4)),
and show that the inverse of the factor multiplying $\hat A^\mu$ is (19.74) without the
‘i’ (cf problem 19.1).

19.9 Verify (19.87).


Part VIII


Weak Interactions and the
Electroweak Theory

Introduction to the Phenomenology of Weak
Interactions


Public letter to the group of the Radioactives at the district society meeting
in Tübingen:


Physikalisches Institut 

der Eidg. Technischen Hochschule 
Gloriastr. 

Zürich 


Zürich, 4. Dec. 1930 


Dear Radioactive Ladies and Gentlemen, 


As the bearer of these lines, to whom I graciously ask you to listen, will
explain to you in more detail, how because of the ‘wrong’ statistics of
the N and ⁶Li nuclei and the continuous β-spectrum, I have hit upon a
desperate remedy to save the ‘exchange theorem’ of statistics and the law
of conservation of energy. Namely, the possibility that there could exist in
the nuclei electrically neutral particles, that I wish to call neutrons, which
have the spin ½ and obey the exclusion principle and which further differ
from light quanta in that they do not travel with the velocity of light.
The mass of the neutrons should be of the same order of magnitude as
the electron mass and in any event not larger than 0.01 proton masses.
– The continuous β-spectrum would then become understandable by the
assumption that in β-decay, a neutron is emitted in addition to the electron
such that the sum of the energies of the neutron and electron is constant.


I admit that on a first look my way out might seem to be quite unlikely,
since one would certainly have seen the neutrons by now if they existed.
But nothing ventured nothing gained, and the seriousness of the matter
with the continuous β-spectrum is illustrated by a quotation of my
honoured predecessor in office, Mr. Debye, who recently told me in Brussels:
‘Oh, it is best not to think about it, like the new taxes.’ Therefore one
should earnestly discuss each way of salvation. – So, dear Radioactives,
examine and judge it. – Unfortunately I cannot appear in Tübingen
personally, since I am indispensable here in Zürich because of a ball on the
night of 6/7 December. – With my best regards to you, and also Mr.
Back, your humble servant,


W. Pauli


Quoted from Winter (2000), pages 4—5. 


At the end of the previous chapter we arrived at an important part of the 
Lagrangian of the Standard Model, namely the terms involving just the gauge 
and Higgs fields. The full electroweak Lagrangian also includes, of course, the 
couplings of these fields to the quarks and leptons. We could at this point sim- 
ply write these couplings down, with little motivation, and proceed at once to 
discuss the empirical consequences. But such an approach, though economi- 
cal, would assume considerable knowledge of weak interaction phenomenology 
on the reader’s part. We prefer to keep this book as self-contained as possible, 
and so in the present chapter we shall provide an introduction to this phe- 
nomenology, following a ‘semi-historical’ route (for fuller historical treatments 
we refer the reader to Marshak et al. 1969, or to Winter 2000, for example). 

Much of what we shall discuss is still, for many purposes, a very useful
approximation to the full theory at energies well below the masses of the W$^\pm$
(~80 GeV) and Z$^0$ (~90 GeV). The reason for this is that in the electroweak
theory (chapter 22), tree-level amplitudes have a structure very similar to that
in the purely electromagnetic case, namely (see equation (8.101))

$$j^\mu_{\rm wk}\left(\frac{-g_{\mu\nu} + q_\mu q_\nu/M^2_{\rm W,Z}}{q^2 - M^2_{\rm W,Z}}\right)j^\nu_{\rm wk}, \tag{20.1}$$

where $j^\mu_{\rm wk}$ is a weak current, and we are using (19.75) for the propagator of
the exchanged W or Z bosons. For $q^2 \ll M^2_{\rm W,Z}$ (20.1) becomes proportional
to the product of two currents; this ‘current–current’ form was for many years
the basis of weak interaction phenomenology, as we now describe.
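The low-energy limit can be written out in one step. The following sketch assumes that the amplitude has the propagator structure $(-g_{\mu\nu} + q_\mu q_\nu/M^2_{\rm W,Z})/(q^2 - M^2_{\rm W,Z})$ between the two currents, and that all components of $q$ are small compared with $M_{\rm W,Z}$, so that the $q_\mu q_\nu/M^2_{\rm W,Z}$ term can be dropped along with $q^2$ in the denominator.

```latex
\begin{align*}
j^{\mu}_{\rm wk}\,
\frac{-g_{\mu\nu} + q_{\mu}q_{\nu}/M^{2}_{\rm W,Z}}{q^{2}-M^{2}_{\rm W,Z}}\,
j^{\nu}_{\rm wk}
\;\xrightarrow{\;q^{2}\,\ll\, M^{2}_{\rm W,Z}\;}\;
\frac{1}{M^{2}_{\rm W,Z}}\; j^{\mu}_{\rm wk}\, g_{\mu\nu}\, j^{\nu}_{\rm wk}
= \frac{1}{M^{2}_{\rm W,Z}}\; j_{{\rm wk}\,\mu}\, j^{\mu}_{\rm wk} .
\end{align*}
```

The exchanged boson has disappeared from the amplitude, leaving a point-like interaction of two currents whose strength is set by $1/M^2_{\rm W,Z}$.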




20.1 Fermi’s ‘current–current’ theory of nuclear β-decay,
and its generalizations


FIGURE 20.1
Four-fermion interaction for neutron $\beta$-decay.

The first quantum field theory of a weak interaction process was proposed
by Fermi (1934a,b) for nuclear $\beta$-decay, building on the ‘neutrino hypothesis’
of Pauli. In 1930, Pauli (in his ‘Dear Radioactive Ladies and Gentlemen’
letter) had suggested that the continuous $e^-$ spectrum in $\beta$-decay could be
understood by supposing that, in addition to the $e^-$, the decaying nucleus
also emitted a light, spin-$\frac{1}{2}$, electrically neutral particle, which he called the
‘neutron’. In this first version of the proposal, Pauli regarded his hypothetical
particle as a constituent of the nucleus. This had the attraction of solving not
only the problem with the continuous $e^-$ spectrum, but a second problem as
well: what he called the ‘wrong’ statistics of the $^{14}$N and $^{6}$Li nuclei. Taking
$^{14}$N for definiteness, the problem was as follows. Assuming that the nucleus
was somehow composed of the only particles (other than the photon) known
in 1930, namely electrons and protons, one requires 14 protons and 7 electrons
for the known charge of 7. This implies a half-odd integer value for the total
nuclear spin, since 21 spin-$\frac{1}{2}$ constituents are involved. But data from molecular
spectra indicated that the nitrogen nuclei obeyed Bose-Einstein, not Fermi-Dirac
statistics, so that (if the usual ‘spin-statistics’ connection were to hold) the spin
of the nitrogen nucleus should be an integer, not a half-odd integer. This second
part of Pauli's hypothesis was quite soon overtaken by the discovery of the (real)
neutron by Chadwick (1932), after which it was rapidly accepted that nuclei
consisted of protons and (Chadwick's) neutrons.
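The counting behind the ‘wrong statistics’ problem can be made explicit with a small sketch. It assumes only that every constituent has spin 1/2 and that orbital angular momentum is an integer; the function name is hypothetical.

```python
def total_spin_is_integer(num_spin_half):
    """True if num_spin_half spin-1/2 constituents can only give integer total spin.

    Orbital angular momentum is always an integer, so only the parity of the
    number of spin-1/2 constituents matters.
    """
    return num_spin_half % 2 == 0


# Pre-1932 picture of 14N: 14 protons plus 7 electrons (net charge 7).
electron_proton_model = 14 + 7
# Post-Chadwick picture: 7 protons plus 7 neutrons.
proton_neutron_model = 7 + 7

print(total_spin_is_integer(electron_proton_model))
print(total_spin_is_integer(proton_neutron_model))
```

Only the proton-neutron picture is compatible with the Bose-Einstein statistics observed for the nitrogen nucleus.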

However, the $\beta$-spectrum problem remained, and at the Solvay Conference
in 1933 Pauli restated his hypothesis (Pauli 1934), using now the name
‘neutrino’ which had meanwhile been suggested by Fermi. Stimulated by the
discussions at the Solvay meeting, Fermi then developed his theory of $\beta$-decay.
In the new picture of the nucleus, neither the electron nor the neutrino was to
be thought of as a nuclear constituent. Instead, the electron-neutrino pair had
somehow to be created and emitted in the transition process of the nuclear
decay, much as a photon is created and emitted in nuclear $\gamma$-decay. Indeed,
Fermi relied heavily on the analogy with electromagnetism. The basic process
was assumed to be the transition neutron $\to$ proton, with the emission of an
$e^-\nu$ pair, as shown in figure 20.1. The n and p were then regarded as ‘elementary’
and without structure (point-like); the whole process took place at a
single space-time point, like the emission of a photon in QED. Further, Fermi
conjectured that the nucleons participated via a weak interaction analogue of
the electromagnetic transition currents frequently encountered in volume 1 for
QED. In this case, however, rather than having the ‘charge conserving’ form
of $\bar u_p \gamma^\mu u_p$ for instance, the ‘weak current’ had the form $\bar u_p \gamma^\mu u_n$, in which the
charge of the nucleon changed. The lepton pair was also charged, obviously.
The whole interaction then had to be Lorentz invariant, implying that the
$e^-\nu$ pair had also to appear in a similar (4-vector) ‘current’ form. Thus a
‘current-current’ amplitude was proposed, of the form

$$ A\, \bar u_p \gamma^\mu u_n \, \bar u_{e^-} \gamma_\mu u_\nu \tag{20.2} $$


where A was a constant. Correspondingly, the process was described field 
theoretically in terms of the local interaction density 


$$ A\, \hat{\bar\psi}_p(x) \gamma^\mu \hat\psi_n(x)\, \hat{\bar\psi}_e(x) \gamma_\mu \hat\psi_\nu(x). \tag{20.3} $$


The discovery of positron $\beta$-decay soon followed, and then of electron capture;
these processes were easily accommodated by adding to (20.3) its Hermitian
conjugate

$$ A\, \hat{\bar\psi}_n(x) \gamma^\mu \hat\psi_p(x)\, \hat{\bar\psi}_\nu(x) \gamma_\mu \hat\psi_e(x), \tag{20.4} $$

taking A to be real. The sum of (20.3) and (20.4) gave a good account of
many observed characteristics of $\beta$-decay, when used to calculate transition
probabilities in first-order perturbation theory.

Soon after Fermi's theory was presented, however, it became clear that the
observed selection rules in some nuclear transitions could not be accounted
for by the forms (20.3) and (20.4). Specifically, in ‘allowed’ transitions (where
the orbital angular momentum carried by the leptons is zero) it was found
that, while for many transitions the nuclear spin did not change ($\Delta J = 0$),
for others, of comparable strength, a change of nuclear spin by one unit
($\Delta J = 1$) occurred. Now, in nuclear decays the energy release is very small
(~ few MeV) compared to the mass of a nucleon, and so the non-relativistic
limit is an excellent approximation for the nucleon spinors. It is then easy to
see (problem 20.1) that, in this limit, the interactions (20.3) and (20.4) imply
that the nucleon spins cannot ‘flip’. Hence some other interaction(s) must
be present. Gamow and Teller (1936) introduced the general four-fermion
interaction, constructed from bilinear combinations of the nucleon pair and of
the lepton pair, but not their derivatives. For example, the combination

$$ \hat{\bar\psi}_p(x) \hat\psi_n(x)\, \hat{\bar\psi}_e(x) \hat\psi_\nu(x) \tag{20.5} $$

could occur, and also

$$ \hat{\bar\psi}_p(x) \sigma_{\mu\nu} \hat\psi_n(x)\, \hat{\bar\psi}_e(x) \sigma^{\mu\nu} \hat\psi_\nu(x) \tag{20.6} $$

where

$$ \sigma_{\mu\nu} = \frac{i}{2} (\gamma_\mu \gamma_\nu - \gamma_\nu \gamma_\mu). \tag{20.7} $$

The non-relativistic limit of (20.5) gives $\Delta J = 0$, but (20.6) allows $\Delta J = 1$.
Other combinations are also possible, as we shall discuss shortly. Note that
the interaction must always be Lorentz invariant.

Thus began a long period of difficult experimentation to establish the
correct form of the $\beta$-decay interaction. With the discovery of the muon (in
1937) and the pion (ten years later) more weak decays became experimentally
accessible, for example $\mu$ decay

$$ \mu^- \to e^- + \nu + \nu \tag{20.8} $$

and $\pi$ decay

$$ \pi^- \to e^- + \nu. \tag{20.9} $$

Note that we have deliberately called all the neutrinos just ‘$\nu$’, without any
particle/antiparticle indication, or lepton flavour label; we shall have more to
say on these matters in section 20.3. There were hopes that the couplings of
the pairs (p, n), ($\nu$, $e^-$) and ($\nu$, $\mu^-$) might have the same form (‘universality’)
but the data was incomplete, and in part apparently contradictory.

The breakthrough came in 1956, when Lee and Yang (1956) suggested that 
parity was not conserved in all weak decays. Hitherto, it had always been assumed
that any physical interaction had to be such that parity was conserved,
and this assumption had been built into the structure of the proposed $\beta$-decay
interactions, such as (20.3), (20.5) or (20.6). Once it was looked for properly,
following the analysis of Lee and Yang, parity violation was indeed found to 
be a strikingly evident feature of weak interactions. 




20.2 Parity violation in weak interactions, 
and V-A theory 


20.2.1  Parity violation 


In 1957, the experiment of Wu et al. (1957) established for the first time that
parity was violated in a weak interaction, specifically nuclear $\beta$-decay. The
experiment involved a sample of $^{60}$Co (J = 5) cooled to 0.01 K in a solenoid.
At this temperature most of the nuclear spins are aligned by the magnetic field,
and so there is a net polarization $\langle \boldsymbol{J} \rangle$, which is in the direction opposite to
the applied magnetic field. $^{60}$Co decays to $^{60}$Ni (J = 4), a $\Delta J = 1$ transition.
The degree of $^{60}$Co alignment was measured from observations of the angular
distribution of $\gamma$-rays from $^{60}$Ni. The relative intensities of electrons emitted
along and against the magnetic field direction were measured, and the results
were consistent with a distribution of the form

$$ I(\theta) = 1 - \langle \boldsymbol{J} \rangle \cdot \boldsymbol{p} / E \tag{20.10} $$
$$ \phantom{I(\theta)} = 1 - P v \cos\theta \tag{20.11} $$

where v, $\boldsymbol{p}$ and E are respectively the electron speed, momentum and energy,
P is the magnitude of the polarization, and $\theta$ is the angle of emission of the
electron with respect to $\langle \boldsymbol{J} \rangle$.

Why does this indicate parity violation? To see this, we recall from the
discussion of the parity operation $\mathbf{P}$ in section 4.2.1 that the angular momentum
$\boldsymbol{J}$ is an axial vector such that $\langle \boldsymbol{J} \rangle \to \langle \boldsymbol{J} \rangle$ under $\mathbf{P}$, while $\boldsymbol{p}$ is a polar
vector transforming by $\boldsymbol{p} \to -\boldsymbol{p}$. Hence, in the parity-transformed system,
the distribution (20.11) would have the form

$$ I_P(\theta) = 1 + P v \cos\theta. \tag{20.12} $$

The difference between (20.12) and (20.11) implies that, by performing the
measurement, we can determine which of the two coordinate systems we must
in fact be using. The two are inequivalent, in contrast to all the other coordinate
system equivalences which we have previously studied (e.g. under three-dimensional
rotations, and Lorentz transformations). This is an operational
consequence of ‘parity violation’. The crucial point in this example, evidently,
is the appearance of the pseudoscalar quantity $\langle \boldsymbol{J} \rangle \cdot \boldsymbol{p}$ in (20.10), alongside the
obviously scalar quantity ‘1’.
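The operational content of this argument can be sketched numerically: the forward-backward asymmetry of the electron intensity changes sign between the distribution (20.11) and its parity-transformed version (20.12). The function names and the sample values of P and v (in units of c) are assumptions chosen for illustration, not data from the experiment.

```python
import math


def intensity(theta, polarization, speed):
    """Angular distribution (20.11): I = 1 - P v cos(theta)."""
    return 1.0 - polarization * speed * math.cos(theta)


def intensity_parity_transformed(theta, polarization, speed):
    """Distribution (20.12) seen in the parity-transformed system."""
    return 1.0 + polarization * speed * math.cos(theta)


def forward_backward_asymmetry(distribution, polarization, speed):
    """(I(0) - I(pi)) / (I(0) + I(pi)) for a given distribution function."""
    forward = distribution(0.0, polarization, speed)
    backward = distribution(math.pi, polarization, speed)
    return (forward - backward) / (forward + backward)


P, v = 0.6, 0.6  # illustrative polarization and electron speed
print(forward_backward_asymmetry(intensity, P, v))
print(forward_backward_asymmetry(intensity_parity_transformed, P, v))
```

A non-zero asymmetry of either sign is already a signal of parity violation; its sign tells us which of the two coordinate systems is in use.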

The Fermi theory, employing only vector currents, needs a modification
to accommodate this result. We saw in section 4.2.1 that a combination of 
vector (‘V’) and axial vector (‘A’) currents would be parity-violating. Indeed, 
after many years of careful experiments, and many false trails, it was even- 
tually established (always, of course, to within some experimental error) that 
the currents participating in Fermi's current-current interaction are, in fact, 
certain combinations of V-type and A-type currents, for both nucleons and 
leptons. 


20.2.2 V-A theory: chirality and helicity 


Quite soon after the discovery of parity violation, Sudarshan and Marshak
(1958), and then Feynman and Gell-Mann (1958) and Sakurai (1958), proposed
a specific form for the current-current interaction, namely the V-A
(‘V minus A’) structure. For example, in place of the leptonic combination
$\bar u_{e^-} \gamma_\mu u_\nu$, these authors proposed the form $\bar u_{e^-} \gamma_\mu (1 - \gamma_5) u_\nu$, being the difference
(with equal weight) of a V-type and an A-type current. For the part
involving the nucleons the proposal was slightly more complicated, having the
form $\bar u_p \gamma_\mu (1 - r\gamma_5) u_n$ where r had the empirical value $r \approx 1.2$. From our
present perspective, of course, the hadronic transition is actually occurring at
the quark level, so that rather than a transition n $\to$ p we now think in terms
of a d $\to$ u one. In this case, the remarkable fact is that the appropriate current
to use is, once again, essentially the simple ‘V-A’ one, $\bar u_u \gamma_\mu (1 - \gamma_5) u_d$.¹
This V-A structure for quarks and leptons is fundamental to the Standard
Model.


¹ We shall see in section 20.7 that a slight modification is necessary.

We must now at once draw the reader’s attention to a rather remarkable
feature of this V-A structure, which is that the $(1 - \gamma_5)$ factor can be thought
of as acting either on the u spinor or on the $\bar u$ spinor. Consider, for example,
a term $\bar u_{e^-} \gamma_\mu (1 - \gamma_5) u_\nu$. We have

$$ \begin{aligned} \bar u_{e^-} \gamma_\mu (1 - \gamma_5) u_\nu &= u^\dagger_{e^-} \beta \gamma_\mu (1 - \gamma_5) u_\nu = [(1 - \gamma_5) u_{e^-}]^\dagger \beta \gamma_\mu u_\nu \\ &= \overline{[(1 - \gamma_5) u_{e^-}]}\, \gamma_\mu u_\nu. \end{aligned} \tag{20.13} $$


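To understand the significance of this, it is advantageous to work in the representation
(3.40) of the Dirac matrices, in which $\gamma_5$ is diagonal, namely

$$ \gamma_5 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}, \quad \boldsymbol\alpha = \begin{pmatrix} \boldsymbol\sigma & 0 \\ 0 & -\boldsymbol\sigma \end{pmatrix}, \quad \beta = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \boldsymbol\gamma = \begin{pmatrix} 0 & -\boldsymbol\sigma \\ \boldsymbol\sigma & 0 \end{pmatrix}. \tag{20.14} $$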
Readers who have not worked through problem 9.4 might like to do so now; 
we may also suggest a backward glance at section 12.4.2 and chapter 17. 


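First of all it is clear that any combination ‘$(1 - \gamma_5)u$’ is an eigenstate of
$\gamma_5$ with eigenvalue $-1$:

$$ \gamma_5 (1 - \gamma_5) u = (\gamma_5 - 1) u = -(1 - \gamma_5) u \tag{20.15} $$

using $\gamma_5^2 = 1$. In the terminology of section 12.4.2, ‘$(1 - \gamma_5)u$’ has definite
chirality, namely L (‘left-handed’), meaning that it belongs to the eigenvalue
$-1$ of $\gamma_5$. We may introduce the projection operators $P_R$, $P_L$ of section 12.4.2,

$$ P_L \equiv \frac{1 - \gamma_5}{2}, \qquad P_R \equiv \frac{1 + \gamma_5}{2}, \tag{20.16} $$

satisfying

$$ P_R^2 = P_R, \quad P_L^2 = P_L, \quad P_R P_L = P_L P_R = 0, \quad P_R + P_L = 1, \tag{20.17} $$

and define

$$ u_L \equiv P_L u, \qquad u_R \equiv P_R u \tag{20.18} $$

for any u. Then

$$ \begin{aligned} \bar u_1 \gamma_\mu \frac{1 - \gamma_5}{2} u_2 &= \bar u_1 \gamma_\mu P_L u_2 = \bar u_1 \gamma_\mu P_L^2 u_2 \\ &= \bar u_1 \gamma_\mu P_L u_{2L} = \bar u_1 P_R \gamma_\mu u_{2L} \\ &= u_1^\dagger P_L \beta \gamma_\mu u_{2L} = \bar u_{1L} \gamma_\mu u_{2L} \end{aligned} \tag{20.19} $$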
which formalizes (20.13) and emphasizes the fact that only the chiral L components
of the u spinors enter into weak interactions, a remarkably simple
statement.

To see the physical consequences of this, we need the forms of the Dirac
spinors in this new representation, which we shall now derive explicitly, for
convenience. As usual, positive energy spinors are defined as solutions of
$(\not{p} - m)u = 0$, so that writing

$$ u = \begin{pmatrix} \phi \\ \chi \end{pmatrix} \tag{20.20} $$

we obtain

$$ \begin{aligned} (E - \boldsymbol\sigma \cdot \boldsymbol{p}) \phi &= m \chi \\ (E + \boldsymbol\sigma \cdot \boldsymbol{p}) \chi &= m \phi. \end{aligned} \tag{20.21} $$
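A convenient choice of 2-component spinors $\phi$, $\chi$ is to take them to be helicity
eigenstates (see section 3.3). For example, the eigenstate $\phi_+$ with positive
helicity $\lambda = +1$ satisfies

$$ \boldsymbol\sigma \cdot \boldsymbol{p}\, \phi_+ = |\boldsymbol{p}| \phi_+ \tag{20.22} $$

while the eigenstate $\phi_-$ with $\lambda = -1$ satisfies (20.22) with a minus on the
right-hand side. Thus the spinor $u(p, \lambda = +1)$ can be written as

$$ u(p, \lambda = +1) = N \begin{pmatrix} \phi_+ \\ \frac{E - |\boldsymbol{p}|}{m} \phi_+ \end{pmatrix}. \tag{20.23} $$

The normalization N is fixed as usual by requiring $\bar u u = 2m$, from which it
follows (problem 20.2) that $N = (E + |\boldsymbol{p}|)^{1/2}$. Thus finally we have

$$ u(p, \lambda = +1) = \begin{pmatrix} \sqrt{E + |\boldsymbol{p}|}\, \phi_+ \\ \sqrt{E - |\boldsymbol{p}|}\, \phi_+ \end{pmatrix}. \tag{20.24} $$

Similarly

$$ u(p, \lambda = -1) = \begin{pmatrix} \sqrt{E - |\boldsymbol{p}|}\, \phi_- \\ \sqrt{E + |\boldsymbol{p}|}\, \phi_- \end{pmatrix}. \tag{20.25} $$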


Now we have agreed that only the chiral ‘L’ components of all u-spinors
enter into weak interactions, in the Standard Model. But from the explicit
form of $\gamma_5$ given in (20.14), we see that when acting on any spinor u, the
projector $P_L$ ‘kills’ the top two components:

$$ P_L \begin{pmatrix} \phi \\ \chi \end{pmatrix} = \begin{pmatrix} 0 \\ \chi \end{pmatrix}. \tag{20.26} $$

In particular

$$ P_L u(p, \lambda = +1) = \begin{pmatrix} 0 \\ \sqrt{E - |\boldsymbol{p}|}\, \phi_+ \end{pmatrix} \tag{20.27} $$

and

$$ P_L u(p, \lambda = -1) = \begin{pmatrix} 0 \\ \sqrt{E + |\boldsymbol{p}|}\, \phi_- \end{pmatrix}. \tag{20.28} $$

Equations (20.27) and (20.28) are very important. In particular, equation
(20.27) implies that in the limit of zero mass m (and hence $E \to |\boldsymbol{p}|$), only
the negative helicity u-spinor will enter. More quantitatively, using

$$ \sqrt{E - |\boldsymbol{p}|} \approx \frac{m}{\sqrt{2E}} \quad \text{for } m \ll E, \tag{20.29} $$

we can say that positive helicity components of all fermions are suppressed in
V-A matrix elements, relative to the negative helicity components, by factors
of order (m/E). Bearing in mind that the helicity operator $\boldsymbol\sigma \cdot \boldsymbol{p} / |\boldsymbol{p}|$ is a
pseudoscalar, this ‘unequal’ treatment for $\lambda = +1$ and $\lambda = -1$ components is,
of course, precisely related to the parity violation built in to the V-A structure.
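The suppression can be checked with a minimal numerical sketch. It builds the spinors (20.24) and (20.25) for momentum along the +z axis, verifies that they satisfy the Dirac equation in the representation (20.14), and compares the surviving chiral-L components. The helper names, the choice of axis and the sample values of E and m are assumptions made for this example.

```python
import math


def helicity_spinors(energy, mass):
    """Return |p| and the spinors (20.24), (20.25) for momentum along +z.

    The two-component helicity eigenstates are taken as phi_+ = (1, 0) and
    phi_- = (0, 1), which is valid only for this choice of axis.
    """
    p = math.sqrt(energy**2 - mass**2)
    big = math.sqrt(energy + p)
    small = mass / big  # equals sqrt(E - |p|), written to avoid cancellation
    u_plus = [big, 0.0, small, 0.0]
    u_minus = [0.0, small, 0.0, big]
    return p, u_plus, u_minus


def dirac_residual(u, energy, p, mass):
    """Components of (E - alpha_z p - beta m) u in the representation (20.14)."""
    alpha_z = [1.0, -1.0, -1.0, 1.0]   # diagonal entries of alpha_z
    beta_u = [u[2], u[3], u[0], u[1]]  # beta exchanges upper and lower halves
    return [(energy - alpha_z[i] * p) * u[i] - mass * beta_u[i] for i in range(4)]


def project_left(u):
    """Apply P_L = (1 - gamma_5)/2, which zeroes the top two components."""
    return [0.0, 0.0, u[2], u[3]]


def norm(u):
    return math.sqrt(sum(x * x for x in u))


energy, mass = 1.0, 0.1  # illustrative values in arbitrary units
p, u_plus, u_minus = helicity_spinors(energy, mass)
ratio = norm(project_left(u_plus)) / norm(project_left(u_minus))
print(ratio, mass / (2.0 * energy))
```

The printed ratio is $m/(E + |\boldsymbol{p}|)$, which approaches $m/2E$ for $m \ll E$, in line with the ‘order m/E’ statement above.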


A similar analysis may be done for the v-spinors. They satisfy $(\not{p} + m)v = 0$
and the normalization $\bar v v = -2m$. We must however remember the ‘small
subtlety’ to do with the labelling of v-spinors, discussed in section 3.4.3: the
2-component spinors $\chi_-$ in $v(p, \lambda = +1)$ actually satisfy $\boldsymbol\sigma \cdot \boldsymbol{p}\, \chi_- = -|\boldsymbol{p}| \chi_-$,
and similarly the $\chi_+$’s in $v(p, \lambda = -1)$ satisfy $\boldsymbol\sigma \cdot \boldsymbol{p}\, \chi_+ = |\boldsymbol{p}| \chi_+$. We then find
(problem 20.3) the results

$$ v(p, \lambda = +1) = \begin{pmatrix} -\sqrt{E - |\boldsymbol{p}|}\, \chi_- \\ \sqrt{E + |\boldsymbol{p}|}\, \chi_- \end{pmatrix} \tag{20.30} $$

and

$$ v(p, \lambda = -1) = \begin{pmatrix} \sqrt{E + |\boldsymbol{p}|}\, \chi_+ \\ -\sqrt{E - |\boldsymbol{p}|}\, \chi_+ \end{pmatrix}. \tag{20.31} $$

Once again, the action of $P_L$ removes the top two components, leaving the result
that, in the massless limit, only the $\lambda = +1$ state survives. Recalling the
‘hole theory’ interpretation of section 3.4.3, this would mean that the positive
helicity components of all antifermions dominate in V-A interactions, negative
helicity components being suppressed by factors of order m/E. The proportionality
of the negative helicity amplitude to the mass of the antifermion is
of course exactly as noted for $\pi^+ \to \mu^+ \nu_\mu$ decay in section 18.2.

We should emphasize that although the above results, stated in italics, 
were derived in the convenient representation (20.14) for the Dirac matrices, 
they actually hold independently of any choice of representation. This can be 
shown by using general helicity projection operators. 


In Pauli’s original letter, he suggested that the mass of the neutrino might
be of the same order as the electron mass. Immediately after the discovery of
parity violation, it was realized that the result could be elegantly explained
by the assumption that the neutrinos were strictly massless particles (Landau
1957, Lee and Yang 1957 and Salam 1957). In this case, u and v spinors
satisfy the same equation $\not{p}$(u or v) = 0, which reduces via (20.21) (in the
m = 0 limit) to the two independent two-component ‘Weyl’ equations

$$ E \phi_0 = \boldsymbol\sigma \cdot \boldsymbol{p}\, \phi_0, \qquad E \chi_0 = -\boldsymbol\sigma \cdot \boldsymbol{p}\, \chi_0. \tag{20.32} $$

Remembering that $E = |\boldsymbol{p}|$ for a massless particle, we see that $\phi_0$ has positive
helicity and $\chi_0$ negative helicity. In this strictly massless case, helicity is
Lorentz invariant, since the direction of $\boldsymbol{p}$ cannot be reversed by a velocity
transformation with v < c. Furthermore, each of the equations in (20.32)
violates parity, since E is clearly a scalar while $\boldsymbol\sigma \cdot \boldsymbol{p}$ is a pseudoscalar (note
that when $m \neq 0$ we can infer from (20.21) that, in this representation, $\phi \leftrightarrow \chi$
under $\mathbf{P}$, which is consistent with (20.32) and with the form of $\beta$ in (20.14)).
Thus the (massless) neutrino could be ‘blamed’ for the parity violation. In
this model, neutrinos have one definite helicity, either positive or negative. As
we have seen, the massless limit of the (four-component) V-A theory leads to
the same conclusion.

Which helicity is actually chosen by Nature was determined in a classic 
experiment by Goldhaber et al. (1958), involving the K-capture reaction 


$$ e^- + {}^{152}\text{Eu} \to \nu + {}^{152}\text{Sm}^*, \tag{20.33} $$


as described by Bettini (2008), for example. They found that the helicity 
of the emitted neutrino was (within errors) 100% negative, a result taken as 
confirming the ‘2-component’ neutrino theory, and the V-A theory. 

We now know that neutrinos are not massless. This information does not 
come from studies of nuclear decays, but rather from a completely different 
phenomenon - that of neutrino oscillations, which we shall mention again in 
the following section, and treat more fully in section 21.4. Neutrino masses 
are so small that the existence of the ‘wrong helicity’ component cannot be 
detected experimentally in processes such as (20.33), or indeed in any of the 
reactions we shall discuss, apart from neutrino oscillations. 

In section 4.2.2 we introduced the charge conjugation operation C (see also 
section 7.5.2). As we noted there, C is not a good symmetry in weak interac- 
tions. The V-A interaction treats a negative helicity fermion very differently 
from a negative helicity antifermion, while one is precisely transformed into 
the other under C. However, it is clear that the helicity operator itself is 
odd under P. Thus the CP conjugate of a negative helicity fermion is a positive
helicity antifermion, which is what the V-A interaction selects. It may
easily be verified (problem 20.4) that the ‘2-component’ theory of (20.32) 
automatically incorporates CP invariance. Elegance notwithstanding, how- 
ever, there are CP-violating weak interactions, as mentioned in section 4.2.3. 
How this is accommodated within the Standard Model we shall discuss in 
section 20.7.3. 

For charged fermions the distinction between particle and antiparticle is 
clear; but is there a conserved quantum number which we can use instead of 
charge to distinguish a neutrino from an antineutrino? That is the question 
to which we now turn. 




20.3 Lepton number and lepton flavours 


In section 1.2.1 of volume 1 we gave a brief discussion of leptonic quantum 
numbers (‘lepton flavours’), adopting a traditional approach in which the data 
is interpreted in terms of conserved quantum numbers carried by neutrinos, 
which serve to distinguish neutrinos from antineutrinos. We must now exam- 
ine the matter more closely, in the light of what we have learned about the 
helicity properties of the V-A interaction. 

In 1955, Davis (1955), following a suggestion made by Pontecorvo (1946),
argued as follows. Consider the $e^-$ capture reaction $e^- + p \to \nu + n$, which was
of course well established. Then in principle the inverse reaction $\nu + n \to e^- + p$
should also exist. Of course, the cross section is extremely small, but by using
a large enough target volume this might perhaps be compensated. Specifically,
the reaction $\nu + {}^{37}\text{Cl} \to e^- + {}^{37}\text{Ar}$ was proposed, the argon being detected
through its radioactive decay. Suppose, however, that the ‘neutrinos’ actually
used are those which accompany electrons in $\beta^-$-decay. If (as was supposed
in section 1.2.1) these are to be regarded as antineutrinos, ‘$\bar\nu$’, carrying a
conserved lepton number, then the reaction

$$ \bar\nu + {}^{37}\text{Cl} \to e^- + {}^{37}\text{Ar} \tag{20.34} $$

should not be observed. If, on the other hand, the ‘$\nu$’ in the capture process
and the ‘$\bar\nu$’ in $\beta$-decay are not distinguished by the weak interaction, the
reaction (20.34) should be observed. Davis found no evidence for reaction
(20.34), at the expected level of cross section, a result which could clearly be
interpreted as confirming the ‘conserved electron number hypothesis’.

However, another interpretation is possible. The $e^-$ in $\beta$-decay has predominately
negative helicity, and its accompanying ‘$\bar\nu$’ has predominately positive
helicity. The fraction of the other helicity present is of the order m/E,
where E ~ few MeV, and the neutrino mass is less than 1 eV; this is, therefore,
an almost undetectable ‘contamination’ of negative helicity component in the
‘$\bar\nu$’. Now the property of the V-A interaction is that it conserves helicity in
the zero mass limit (in which chirality is the same as helicity). Hence the
positive helicity ‘$\bar\nu$’ from $\beta^-$-decay will (predominately) produce a positive
helicity lepton, which must be the $e^+$ not the $e^-$. Thus the property of the
V-A interaction, together with the very small value of the neutrino mass, conspire
effectively to forbid (20.34), independently of any considerations about
‘lepton number’.
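To put a number on this ‘contamination’, the following sketch evaluates the order-of-magnitude estimate m/E for a neutrino at the quoted mass bound and, for contrast, for an electron of the same energy. The function name, the sample energy of 2 MeV and the use of m/E as a bare estimate (dropping factors of order one) are assumptions of this example; the electron mass of 0.511 MeV is the standard value.

```python
def wrong_helicity_fraction(mass_ev, energy_mev):
    """Order-of-magnitude estimate m/E, with m in eV and E in MeV."""
    energy_ev = energy_mev * 1.0e6
    return mass_ev / energy_ev


NEUTRINO_MASS_BOUND_EV = 1.0     # upper bound quoted in the text
ELECTRON_MASS_EV = 0.511e6       # electron mass, 0.511 MeV
SAMPLE_ENERGY_MEV = 2.0          # assumed typical beta-decay energy

print(wrong_helicity_fraction(NEUTRINO_MASS_BOUND_EV, SAMPLE_ENERGY_MEV))
print(wrong_helicity_fraction(ELECTRON_MASS_EV, SAMPLE_ENERGY_MEV))
```

The neutrino estimate is below one part in a million, whereas for the electron the corresponding factor is far from negligible at these energies.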

Indeed, the ‘helicity-allowed’ reaction

$$ \bar\nu + p \to e^+ + n \tag{20.35} $$

was observed by Reines and Cowan (1956) (see also Cowan et al. 1956). Reaction
(20.35) too, of course, can be interpreted in terms of ‘$\bar\nu$’ carrying a lepton
number of $-1$, equal to that of the $e^+$. It was also established that only ‘$\nu$’
produced $e^-$ via (20.34), where ‘$\nu$’ is the helicity $-1$ state (or, on the other
interpretation, the carrier of lepton number +1).

The situation may therefore be summarized as follows. In the case of $e^-$
and $e^+$, all four ‘modes’, $e^-(\lambda = +1)$, $e^-(\lambda = -1)$, $e^+(\lambda = +1)$ and $e^+(\lambda = -1)$,
are experimentally accessible via electromagnetic interactions, even though
only two generally dominate in weak interactions ($e^-(\lambda = -1)$ and $e^+(\lambda = +1)$).
Neutrinos, on the other hand, seem to interact only weakly. In their
case, we may if we wish say that the participating states are (in association
with $e^-$ or $e^+$) $\bar\nu_e(\lambda = +1)$ and $\nu_e(\lambda = -1)$, to a very good approximation.
But we may also regard these two states as simply two different helicity states
of one particle, rather than of a particle and its antiparticle. As we have seen,
the helicity rules do the job required just as well as the lepton number rules. In
short, the question is: are these ‘neutrinos’ distinguished only by their helicity,
or is there an additional distinguishing characteristic (‘electron number’)?
In the latter case we should expect the ‘other’ two states $\bar\nu_e(\lambda = -1)$ and
$\nu_e(\lambda = +1)$ to exist as well as the ones known from weak interactions.

If, in fact, no quantum number — other than the helicity — exists which 
distinguishes the neutrino states, then we would have to say that the C- 
conjugate of a neutrino state is a neutrino, not an antineutrino — that is, 
‘neutrinos are their own antiparticles’. A neutrino would be a fermionic state 
somewhat like a photon, which is of course also its own antiparticle. Such 
‘C-self-conjugate’ fermions are called Majorana fermions (Majorana 1937), in 
contrast to the Dirac variety, which have all four possible modes present (2 
helicities, 2 particle/antiparticle). We discussed Majorana fermions in sections 
4.2.2 and 7.5.2. 

The distinction between the ‘Dirac’ and ‘Majorana’ neutrino possibilities
becomes an essentially ‘metaphysical’ one in the limit of strictly massless neutrinos,
since then (as we have seen) a given helicity state cannot be flipped
by going to a suitably moving Lorentz frame, nor by any weak (or electromagnetic)
interaction, since they both conserve chirality which is the same as
helicity in the massless limit. We would have just the two states $\nu_e(\lambda = -1)$
and $\bar\nu_e(\lambda = +1)$, and no way of creating $\nu_e(\lambda = +1)$ or $\bar\nu_e(\lambda = -1)$. The
overbar label then becomes superfluous. Unfortunately, the massless limit is approached
smoothly, and neutrino masses are, in fact, so small that the ‘wrong
helicity’ suppression factors will make it very difficult to see the presence of the
possible states $\nu_e(\lambda = +1)$, $\bar\nu_e(\lambda = -1)$.

FIGURE 20.2
Double $\beta$-decay without emission of a neutrino, a test for Majorana-type neutrinos.

One much-discussed experimental test case (see, for example, the review
by Vogel and Piepke in Nakamura et al. 2010) concerns ‘neutrinoless double
$\beta$-decay’, which is the process $A \to A' + e^- + e^-$, where A, A′ are nuclei. If
the neutrino emitted in the first $\beta$-decay carries no electron-type conserved
quantum number, then in principle it can initiate a second weak interaction,
exactly as in Davis’ original argument, via the diagram shown in figure 20.2.
Note that this is a second-order weak process, so that the amplitude contains
the very small factor $G_F^2$. Furthermore, the $\nu$ emitted along with the $e^-$
at the first vertex will be predominately $\lambda = +1$, but in the second vertex
the V-A interaction will ‘want’ it to have $\lambda = -1$, like the outgoing $e^-$.
Thus there is bound to be one ‘m/E’ suppression factor, whichever vertex we
choose to make ‘easy’. (In the case of 3-state neutrino mixing, see section
21.4, the quantity ‘m’ will be an appropriately averaged mass.) There is
also a complicated nuclear physics overlap factor. The expected half-lives of
neutrinoless double $\beta$ decays depend on the decaying nucleus, but are typically
longer than $10^{24}$ to $10^{25}$ years. Evidently, the observation of this rare process
is a formidable experimental challenge; as yet, no confirmed observation exists
(see also section 21.4.5).

In the same way, ‘$\bar\nu$’ particles accompanying the $\mu^-$’s in $\pi^-$ decay

$$ \pi^- \to \mu^- + \bar\nu \tag{20.36} $$

are observed to produce only $\mu^+$’s when they interact with matter, not $\mu^-$’s.
Again this can be interpreted either in terms of helicity conservation or in
terms of conservation of a leptonic quantum number $L_\mu$. We shall assume the
analogous properties are true for the ‘$\bar\nu$’s accompanying $\tau$ leptons.
On the other hand, helicity arguments alone would allow the reaction

$$ \bar\nu_\mu + p \to e^+ + n \tag{20.37} $$

to proceed, but as we saw in section 1.2.1 the experiment of Danby et al.
(1962) found no evidence for it. Thus there is evidence, in this type of reaction,
for a flavour quantum number distinguishing neutrinos which interact
in association with one kind of charged lepton from those which interact in
association with a different charged lepton. The electroweak sector of the
Standard Model was originally formulated on the assumption that the three
lepton flavours $L_e$, $L_\mu$ and $L_\tau$ are conserved, and that the neutrinos are massless.
It turns out that these two assumptions are related, in the sense that
if neutrinos have mass, then (barring degeneracies) ‘neutrino oscillations’ can
occur, in which a state of one lepton flavour can acquire a component of another,
as it propagates. Compelling evidence accumulated during the 2000s
for oscillations of neutrinos caused by non-zero masses and neutrino mixing.
Strictly speaking, neutrino masses and oscillations lie outside the framework
of the original Standard Model, and they are sometimes so regarded. Apart
from anything else, the phenomenology of massive neutrinos has to allow for
the possibility that they are Majorana, rather than Dirac, fermions. For the
moment, we shall continue with a semi-historical path, and proceed with weak
interaction phenomenology on the basis of the original Standard Model, with
massless neutrinos. We return to the question of neutrino mass when we discuss
neutrino oscillations (along with analogous oscillations in meson systems)
in chapter 21.
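The bookkeeping in this section can be summarized in a short sketch that checks reactions (20.34), (20.35) and (20.37) against flavour-by-flavour and total lepton number conservation. The particle labels and function names are hypothetical, and the antineutrinos in (20.34) and (20.35) are taken to be of electron type, as implied by their origin in $\beta^-$-decay. For these three reactions, total lepton number gives the same verdict as the helicity argument, which is an observation about this example rather than a general statement.

```python
# Lepton flavour assignments (L_e, L_mu) in the traditional scheme; hadrons and
# nuclei carry none. Particle labels are illustrative strings, not a standard API.
LEPTON_NUMBERS = {
    "e-": (1, 0), "e+": (-1, 0), "nu_e": (1, 0), "nubar_e": (-1, 0),
    "mu-": (0, 1), "mu+": (0, -1), "nu_mu": (0, 1), "nubar_mu": (0, -1),
}


def flavour_totals(particles):
    """Sum (L_e, L_mu) over a list of particle labels."""
    le = sum(LEPTON_NUMBERS.get(p, (0, 0))[0] for p in particles)
    lmu = sum(LEPTON_NUMBERS.get(p, (0, 0))[1] for p in particles)
    return le, lmu


def conserves_flavour(initial, final):
    """True if L_e and L_mu are separately conserved."""
    return flavour_totals(initial) == flavour_totals(final)


def conserves_total_lepton_number(initial, final):
    """True if only the sum L_e + L_mu is required to be conserved."""
    return sum(flavour_totals(initial)) == sum(flavour_totals(final))


REACTIONS = {
    "20.34": (["nubar_e", "37Cl"], ["e-", "37Ar"]),
    "20.35": (["nubar_e", "p"], ["e+", "n"]),
    "20.37": (["nubar_mu", "p"], ["e+", "n"]),
}

for label, (initial, final) in REACTIONS.items():
    print(label, conserves_flavour(initial, final),
          conserves_total_lepton_number(initial, final))
```

Reaction (20.37) is the interesting case: it passes the total lepton number (and helicity) test but fails the flavour test, which is what the absence of a signal in the experiment of Danby et al. indicates.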




20.4 The universal current $\times$ current theory for weak
interactions of leptons


After the breakthroughs of parity violation and V-A theory, the earlier hopes
(Pontecorvo 1947, Klein 1948, Puppi 1948, Lee, Rosenbluth and Yang 1949,
Tiomno and Wheeler 1949) were revived of a universal weak interaction among
the pairs of particles (p, n), ($\nu_e$, $e^-$), ($\nu_\mu$, $\mu^-$), using the V-A modification to
Fermi’s theory. From our modern standpoint, this list has to be changed
by the replacement of (p, n) by the corresponding quarks (u, d), and by the
inclusion of the third lepton pair ($\nu_\tau$, $\tau^-$) as well as two other quark pairs
(c, s) and (t, b). It is to these pairs that the ‘V-A’ structure applies, as already
indicated in section 20.2.2, and a certain form of ‘universality’ does hold, as
we now describe.

Because of certain complications which arise, we shall postpone the discussion
of the quark currents until section 20.7, concentrating here on the
leptonic currents². In this case, Fermi’s original vector-like current $\hat{\bar\psi}_e \gamma^\mu \hat\psi_\nu$
becomes modified to a total leptonic charged current

$$ \hat j^\mu_{\rm CC}(\text{leptons}) = \hat j^\mu_{\rm wk}(e) + \hat j^\mu_{\rm wk}(\mu) + \hat j^\mu_{\rm wk}(\tau) \tag{20.38} $$

where, for example,

$$ \hat j^\mu_{\rm wk}(e) = \bar{\hat\nu}_e \gamma^\mu (1 - \gamma_5) \hat e. \tag{20.39} $$

In (20.39) we are now adopting, for the first time, a useful shorthand whereby
the field operator for the electron field, say, is denoted by $\hat e(x)$ rather than
$\hat\psi_e(x)$, and the ‘x’ argument is suppressed. The ‘charged’ current terminology
refers to the fact that these weak current operators $\hat j^\mu_{\rm wk}$ carry net charge, in
contrast to an electromagnetic current operator such as $\bar{\hat e} \gamma^\mu \hat e$ which is electrically
neutral. We shall see in section 20.6 that there are also electrically
neutral weak currents.


² Very much the same complications arise for the leptonic currents too, in the case of
massive neutrinos, as we shall see in section 21.4.

The interaction Hamiltonian density accounting for all leptonic weak interactions
is then taken to be

$$ \hat{\mathcal H}^{\rm lep}_{\rm CC} = \frac{G_F}{\sqrt 2}\, \hat j^\mu_{\rm CC}(\text{leptons})\, \hat j^\dagger_{{\rm CC}\,\mu}(\text{leptons}). \tag{20.40} $$


Note that

$$ \left( \bar{\hat\nu}_e \gamma^\mu (1 - \gamma_5) \hat e \right)^\dagger = \bar{\hat e} \gamma^\mu (1 - \gamma_5) \hat\nu_e \tag{20.41} $$

and similarly for the other bilinears. The currents can also be written in terms
of the chiral components of the fields (recall section 20.2.2) using

$$ 2\, \bar{\hat\nu}_{eL} \gamma^\mu \hat e_L = \bar{\hat\nu}_e \gamma^\mu (1 - \gamma_5) \hat e, \tag{20.42} $$

for example. ‘Universality’ is manifest in the fact that all the lepton pairs
have the same form of the V-A coupling, and the same ‘strength parameter’
$G_F/\sqrt 2$ multiplies all of the products in (20.40).

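The terms in (20.40), when it is multiplied out, describe many physical
processes. For example, the term

$$ \frac{G_F}{\sqrt 2}\, \bar{\hat\nu}_\mu \gamma^\mu (1 - \gamma_5) \hat\mu\; \bar{\hat e} \gamma_\mu (1 - \gamma_5) \hat\nu_e \tag{20.43} $$

describes $\mu^-$ decay:

$$ \mu^- \to \nu_\mu + e^- + \bar\nu_e, \tag{20.44} $$

as well as all the reactions related by ‘crossing’ particles from one side to the
other, for example

$$ \nu_\mu + e^- \to \mu^- + \nu_e. \tag{20.45} $$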

The value of $G_F$ can be determined from the rate for process (20.44) (see for
example Renton 1990, section 6.1.2), and it is found to be

$$ G_F \simeq 1.166 \times 10^{-5}\ \text{GeV}^{-2}. \tag{20.46} $$

This is a convenient moment to notice that the theory is not renormalizable
according to the criteria discussed in section 11.8 at the end of the previous
volume: $G_F$ has dimensions (mass)$^{-2}$. We shall return to this aspect of Fermi-
type V-A theory in section 22.1.
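Because $G_F$ carries dimensions (mass)$^{-2}$, a dimensionless measure of the interaction strength at an energy scale E is the combination $G_F E^2$. The sketch below evaluates it at a few sample energies; the function name and the chosen energies are assumptions for illustration, and the combination is a dimensional estimate rather than a calculated amplitude.

```python
G_F = 1.166e-5  # GeV^-2, from (20.46)


def dimensionless_strength(energy_gev):
    """Dimensionless combination G_F * E^2 for an energy scale E in GeV."""
    return G_F * energy_gev**2


for energy in (0.001, 1.0, 100.0, 300.0):
    print(energy, dimensionless_strength(energy))
```

The combination is tiny at the MeV energies of nuclear decays but reaches order one at a few hundred GeV, which signals that the Fermi-type theory cannot be the whole story at high energies.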

There are also what we might call ‘diagonal’ terms in which the same
lepton pair is taken from $\hat j^\mu_{\rm CC}$ and $\hat j^\dagger_{{\rm CC}\,\mu}$, for example

$$ \frac{G_F}{\sqrt 2}\, \bar{\hat\nu}_e \gamma^\mu (1 - \gamma_5) \hat e\; \bar{\hat e} \gamma_\mu (1 - \gamma_5) \hat\nu_e \tag{20.47} $$

which describes reactions such as

$$ \bar\nu_e + e^- \to \bar\nu_e + e^-. \tag{20.48} $$

The cross section for (20.48) was measured by Reines, Gurr and Sobel (1976)
after many years of effort; the value obtained was consistent with the Glashow-
Salam-Weinberg theory (see section 22.3), with the parameter $\sin^2\theta_W = 0.29 \pm 0.05$.

It is interesting that some seemingly rather similar processes are forbidden
to occur, to first order in $\hat{\mathcal H}^{\rm lep}_{\rm CC}$, for example

$$ \bar\nu_\mu + e^- \to \bar\nu_\mu + e^-. \tag{20.49} $$

For reasons which will become clearer in section 20.6, (20.49) is called a ‘neutral
current’ process, in contrast to all the others (such as $\beta$-decay or $\mu$-decay)
we have discussed so far, which are called ‘charged current’ processes. If the
lepton pairs are arranged so as to have no net lepton number (for example
$e^- \bar\nu_e$, $\mu^- \bar\nu_\mu$, $\nu_\mu \bar\nu_\mu$ etc.) then pairs with non-zero charge occur in charged current
processes, while those with zero charge participate in neutral current
processes. In the case of (20.48), the leptons can be grouped either as ($\bar\nu_e e^-$)
which is charged, or as ($\bar\nu_e \nu_e$) or ($e^+ e^-$) which are neutral. On the other
hand, there is no way of pairing the leptons in (20.49) so as to cancel the lepton
number and have non-zero charge. So (20.49) is a purely ‘neutral current’
process, while some ‘neutral current’ contribution could be present in (20.48),
in principle. In 1973 such neutral current processes were discovered (Hasert
et al. 1973), generating a whole new wave of experimental activity. Their existence
had, in fact, been predicted in the first version of the Standard Model,
due to Glashow (1961). Today we know that charged current processes are
mediated by the $W^\pm$ bosons, and the neutral current ones by the $Z^0$. We shall
discuss the neutral current couplings in section 20.6.


20.5 Calculation of the cross section for $\nu_\mu + e^- \to \mu^- + \nu_e$


After so much qualitative discussion it is time to calculate something. We
choose the process (20.45), sometimes called inverse muon decay, which is a
pure ‘charged current’ process. The amplitude, in the Fermi-like V-A current
theory, is

$$\mathcal{M} = -{\rm i}(G_F/\sqrt{2})\,\bar u(\mu,k')\gamma_\mu(1-\gamma_5)u(\nu_\mu,k)\;\bar u(\nu_e,p')\gamma^\mu(1-\gamma_5)u(e,p). \tag{20.50}$$

We shall be interested in energies much greater than any of the lepton masses,
and so we shall work in the massless limit; this is mainly for ease of
calculation, since the full expressions for non-zero masses can be obtained
with more effort. From the general formula (6.129) for $2 \to 2$ scattering in
the CM system, we have, neglecting all masses,

$$\frac{d\sigma}{d\Omega} = \frac{1}{64\pi^2 s}\,\overline{|\mathcal{M}|^2} \tag{20.51}$$


where $\overline{|\mathcal{M}|^2}$ is the appropriate spin-averaged matrix element squared, as in
(8.183) for example. In the case of neutrino-electron scattering, we must
average over initial electron states for unpolarized electrons and sum over the
final muon polarization states. For the neutrinos there is no averaging over
initial neutrino helicities, since only left-handed (massless) neutrinos
participate in the weak interaction. Similarly, there is no sum over final
neutrino helicities. However, for convenience of calculation, we can in fact sum
over both helicity states of both neutrinos since the $(1-\gamma_5)$ factors
guarantee that right-handed neutrinos contribute nothing to the cross section.
As for the $e^-\mu^-$ scattering example in section 8.7, the calculation then
reduces to a product of traces:

$$\overline{|\mathcal{M}|^2} = \left(\frac{G_F^2}{2}\right){\rm Tr}[\slashed{k}'\gamma_\mu(1-\gamma_5)\slashed{k}\gamma_\nu(1-\gamma_5)]\;\frac{1}{2}{\rm Tr}[\slashed{p}'\gamma^\mu(1-\gamma_5)\slashed{p}\gamma^\nu(1-\gamma_5)] \tag{20.52}$$

all lepton masses being neglected. We define

$$\overline{|\mathcal{M}|^2} = \left(\frac{G_F^2}{2}\right)N_{\mu\nu}E^{\mu\nu} \tag{20.53}$$

where the $\nu_\mu \to \mu^-$ tensor $N_{\mu\nu}$ is given by

$$N_{\mu\nu} = {\rm Tr}[\slashed{k}'\gamma_\mu(1-\gamma_5)\slashed{k}\gamma_\nu(1-\gamma_5)] \tag{20.54}$$

without a $1/(2s+1)$ factor, and the $e^- \to \nu_e$ tensor is

$$E^{\mu\nu} = \frac{1}{2}{\rm Tr}[\slashed{p}'\gamma^\mu(1-\gamma_5)\slashed{p}\gamma^\nu(1-\gamma_5)] \tag{20.55}$$

including a factor of $\frac{1}{2}$ for spin averaging.

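Since this calculation involves a couple of new features, let us look at it in
some detail. By commuting the $(1-\gamma_5)$ factor through two $\gamma$ matrices
($\slashed{k}\gamma_\nu$) and using the result that

$$(1-\gamma_5)^2 = 2(1-\gamma_5) \tag{20.56}$$

the tensor $N_{\mu\nu}$ may be written as

$$N_{\mu\nu} = 2{\rm Tr}[\slashed{k}'\gamma_\mu(1-\gamma_5)\slashed{k}\gamma_\nu]
 = 2{\rm Tr}(\slashed{k}'\gamma_\mu\slashed{k}\gamma_\nu) - 2{\rm Tr}(\gamma_5\slashed{k}\gamma_\nu\slashed{k}'\gamma_\mu). \tag{20.57}$$

The first trace is the same as in our calculation of $e^-\mu^-$ scattering (cf (8.186)):

$${\rm Tr}(\slashed{k}'\gamma_\mu\slashed{k}\gamma_\nu) = 4[k'_\mu k_\nu + k'_\nu k_\mu + (q^2/2)g_{\mu\nu}]. \tag{20.58}$$

The second trace must be evaluated using the result

$${\rm Tr}(\gamma_5\slashed{a}\slashed{b}\slashed{c}\slashed{d}) = 4{\rm i}\epsilon_{\alpha\beta\gamma\delta}a^\alpha b^\beta c^\gamma d^\delta \tag{20.59}$$

(see equation (J.37) in appendix J of volume 1). The totally antisymmetric
tensor $\epsilon_{\alpha\beta\gamma\delta}$ is just the generalization of $\epsilon_{ijk}$ to four
dimensions, and is defined by

$$\epsilon_{\alpha\beta\gamma\delta} = \begin{cases} +1 & \text{for } \epsilon_{0123} \text{ and all even permutations of } 0,1,2,3 \\ -1 & \text{for } \epsilon_{1023} \text{ and all odd permutations of } 0,1,2,3 \\ 0 & \text{otherwise.} \end{cases} \tag{20.60}$$

Its appearance here is a direct consequence of parity violation. Notice that
this definition has the consequence that

$$\epsilon_{0123} = +1 \tag{20.61}$$

but

$$\epsilon^{0123} = -1. \tag{20.62}$$

We will also need to contract two $\epsilon$ tensors. By looking at the possible
combinations, it should be easy to convince yourself of the result

$$\epsilon_{ijk}\epsilon_{ilm} = \begin{vmatrix} \delta_{jl} & \delta_{jm} \\ \delta_{kl} & \delta_{km} \end{vmatrix} \tag{20.63}$$

i.e.

$$\epsilon_{ijk}\epsilon_{ilm} = \delta_{jl}\delta_{km} - \delta_{kl}\delta_{jm}. \tag{20.64}$$

For the four-dimensional $\epsilon$ tensor one can show (see problem 20.6)

$$\epsilon_{\mu\nu\alpha\beta}\epsilon^{\mu\nu\gamma\delta} = -2!\begin{vmatrix} \delta^\gamma_\alpha & \delta^\delta_\alpha \\ \delta^\gamma_\beta & \delta^\delta_\beta \end{vmatrix} \tag{20.65}$$

where the minus sign arises from (20.62) and the 2! from the fact that the two
indices are contracted.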
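
The contraction identities (20.64) and (20.65) can be checked by brute force.
The following is only an illustrative sketch: the helper names `levi_civita`
and `delta` are our own, and the four-dimensional check assumes the metric
signature (+,-,-,-), so that raising all four indices of a non-vanishing
component simply flips its sign, as in (20.61) and (20.62).

```python
import itertools


def levi_civita(*indices):
    """Antisymmetric symbol with levi_civita(0, 1, ..., n-1) = +1."""
    if len(set(indices)) != len(indices):
        return 0
    n = len(indices)
    # The sign of a permutation is (-1)^(number of inversions).
    inversions = sum(
        1 for i in range(n) for j in range(i + 1, n) if indices[i] > indices[j]
    )
    return -1 if inversions % 2 else 1


def delta(a, b):
    """Kronecker delta."""
    return 1 if a == b else 0


def check_three_dim_identity():
    """Check (20.64) for every choice of the free indices j, k, l, m."""
    for j, k, l, m in itertools.product(range(3), repeat=4):
        lhs = sum(levi_civita(i, j, k) * levi_civita(i, l, m) for i in range(3))
        rhs = delta(j, l) * delta(k, m) - delta(k, l) * delta(j, m)
        if lhs != rhs:
            return False
    return True


def check_four_dim_identity():
    """Check (20.65), using eps^{mu nu gamma delta} = -eps_{mu nu gamma delta}."""
    for a, b, c, d in itertools.product(range(4), repeat=4):
        lhs = sum(
            levi_civita(m, n, a, b) * (-levi_civita(m, n, c, d))
            for m in range(4)
            for n in range(4)
        )
        rhs = -2 * (delta(c, a) * delta(d, b) - delta(d, a) * delta(c, b))
        if lhs != rhs:
            return False
    return True


if __name__ == "__main__":
    print("3D identity holds:", check_three_dim_identity())
    print("4D identity holds:", check_four_dim_identity())
```
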

We can now evaluate $N_{\mu\nu}$. We obtain, after some rearrangement of indices,
the result for the $\nu_\mu \to \mu^-$ tensor:

$$N_{\mu\nu} = 8[(k'_\mu k_\nu + k'_\nu k_\mu + (q^2/2)g_{\mu\nu}) - {\rm i}\epsilon_{\mu\nu\alpha\beta}k^\alpha k'^\beta]. \tag{20.66}$$

For the electron tensor $E^{\mu\nu}$ we have a similar result (divided by 2):

$$E^{\mu\nu} = 4[(p'^\mu p^\nu + p'^\nu p^\mu + (q^2/2)g^{\mu\nu}) - {\rm i}\epsilon^{\mu\nu\gamma\delta}p_\gamma p'_\delta]. \tag{20.67}$$

Next, we have to perform the contraction $N_{\mu\nu}E^{\mu\nu}$ in (20.53). In the case
of elastic $e^-\mu^-$ scattering considered in section 8.7, the analogous contraction
between the tensors $L_{\mu\nu}$ and $M^{\mu\nu}$ was simplified by using the conditions
$q^\mu L_{\mu\nu} = q^\nu L_{\mu\nu} = 0$ (see (8.189)), which followed from electromagnetic
current conservation at the electron vertex (see (8.188)):
$q^\mu\bar u(k')\gamma_\mu u(k) = 0$. Here, the analogous vertex is
$\bar u(\mu,k')\gamma_\mu(1-\gamma_5)u(\nu_\mu,k)$. In this case, when we contract
this with $q^\mu = (k-k')^\mu$ we find a non-zero result:

$$(m_{\nu_\mu} - m_\mu)\bar u(\mu,k')u(\nu_\mu,k) + (m_{\nu_\mu} + m_\mu)\bar u(\mu,k')\gamma_5 u(\nu_\mu,k), \tag{20.68}$$

using the on-shell conditions for the spinors. (In the electromagnetic case,
there was no $\gamma_5$ term, and the initial and final masses were the same.) The
quantity (20.68) vanishes only when the lepton masses vanish, and that is the
approximation we shall make: i.e. we shall neglect all lepton masses. Then

$$q^\mu N_{\mu\nu} = q^\nu N_{\mu\nu} = 0, \tag{20.69}$$

and we may write

$$p' = p + q \tag{20.70}$$

and drop all terms involving $q$ in the contraction with $N_{\mu\nu}$. In the
antisymmetric term, however, we have

$$\epsilon^{\mu\nu\gamma\delta}p_\gamma(p_\delta + q_\delta) = \epsilon^{\mu\nu\gamma\delta}p_\gamma q_\delta \tag{20.71}$$

since the term with $p_\delta$ vanishes because of the antisymmetry of
$\epsilon_{\mu\nu\gamma\delta}$. Thus we arrive at

$$E^{\mu\nu}_{\rm eff} = 8p^\mu p^\nu + 2q^2 g^{\mu\nu} - 4{\rm i}\epsilon^{\mu\nu\gamma\delta}p_\gamma q_\delta. \tag{20.72}$$


We must now evaluate the ‘$N \cdot E$’ contraction in (20.53). Since we are
neglecting all masses, it is easiest to perform the calculation in invariant form
before specializing to the ‘laboratory’ frame. The usual Mandelstam variables
are (neglecting all masses)

$$s = 2k\cdot p \tag{20.73}$$
$$u = -2k'\cdot p \tag{20.74}$$
$$t = -2k\cdot k' = q^2 \tag{20.75}$$

satisfying

$$s + t + u = 0. \tag{20.76}$$

The result of performing the contraction

$$N_{\mu\nu}E^{\mu\nu} = N_{\mu\nu}E^{\mu\nu}_{\rm eff} \tag{20.77}$$

may be found using the result (20.65) for the contraction of two $\epsilon$ tensors (see
problem 20.6): the answer for $\nu_\mu e^- \to \mu^-\nu_e$ is

$$N_{\mu\nu}E^{\mu\nu} = 16(s^2+u^2) + 16(s^2-u^2) \tag{20.78}$$

where the first term arises from the symmetric part of $N_{\mu\nu}$ similar to $L_{\mu\nu}$,
and the second term from the antisymmetric part involving $\epsilon_{\mu\nu\alpha\beta}$. We have
also used

$$t = q^2 = -(s+u) \tag{20.79}$$

valid in the approximation in which we are working. Thus for $\nu_\mu e^- \to \mu^-\nu_e$
we have

$$N_{\mu\nu}E^{\mu\nu} = +32s^2 \tag{20.80}$$

and with

$$\frac{d\sigma}{d\Omega} = \frac{1}{64\pi^2 s}\left(\frac{G_F^2}{2}\right)N_{\mu\nu}E^{\mu\nu} \tag{20.81}$$

we finally obtain the result

$$\frac{d\sigma}{d\Omega} = \frac{G_F^2 s}{4\pi^2}. \tag{20.82}$$

The total cross section is then

$$\sigma = \frac{G_F^2 s}{\pi}. \tag{20.83}$$
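
As a cross-check on (20.78)-(20.80), the tensors (20.66) and (20.72) can be
contracted numerically for massless CM kinematics. This is a sketch rather
than part of the calculation above: the function names, the choice of CM
momentum and the explicit component conventions (metric (+,-,-,-),
$\epsilon_{0123} = +1$, upper-index $\epsilon$ equal to minus the lower-index one) are
assumptions made for the illustration.

```python
import itertools
import math

METRIC = (1.0, -1.0, -1.0, -1.0)  # diagonal of g_{mu nu}; g^{mu nu} is the same


def permutation_sign(indices):
    """+1 / -1 for even / odd permutations of (0,1,2,3), 0 if an index repeats."""
    if len(set(indices)) != 4:
        return 0
    inversions = sum(
        1 for i in range(4) for j in range(i + 1, 4) if indices[i] > indices[j]
    )
    return -1 if inversions % 2 else 1


def lower(v):
    """Lower the index of a contravariant 4-vector."""
    return [METRIC[m] * v[m] for m in range(4)]


def dot(a, b):
    """Minkowski product of two contravariant 4-vectors."""
    return sum(METRIC[m] * a[m] * b[m] for m in range(4))


def cm_momenta(momentum, theta):
    """Massless CM momenta: incoming nu (k), outgoing muon (k'), incoming e (p)."""
    k = [momentum, 0.0, 0.0, momentum]
    p = [momentum, 0.0, 0.0, -momentum]
    k_prime = [momentum, momentum * math.sin(theta), 0.0, momentum * math.cos(theta)]
    return k, k_prime, p


def mandelstam(k, k_prime, p):
    """s, t, u as in (20.73)-(20.75)."""
    return 2 * dot(k, p), -2 * dot(k, k_prime), -2 * dot(k_prime, p)


def contract_tensors(k, k_prime, p):
    """Sum over mu, nu of N_{mu nu} (20.66) times E_eff^{mu nu} (20.72)."""
    q = [k[m] - k_prime[m] for m in range(4)]
    q2 = dot(q, q)
    k_lo, kp_lo, p_lo, q_lo = lower(k), lower(k_prime), lower(p), lower(q)
    total = 0j
    for m, n in itertools.product(range(4), repeat=2):
        g = METRIC[m] if m == n else 0.0
        # eps_{mu nu alpha beta} k^alpha k'^beta (lower-index epsilon)
        eps_kk = sum(permutation_sign((m, n, a, b)) * k[a] * k_prime[b]
                     for a in range(4) for b in range(4))
        # eps^{mu nu gamma delta} p_gamma q_delta (upper-index epsilon = -lower)
        eps_pq = sum(-permutation_sign((m, n, c, d)) * p_lo[c] * q_lo[d]
                     for c in range(4) for d in range(4))
        n_mn = 8 * (kp_lo[m] * k_lo[n] + kp_lo[n] * k_lo[m] + 0.5 * q2 * g - 1j * eps_kk)
        e_mn = 8 * p[m] * p[n] + 2 * q2 * g - 4j * eps_pq
        total += n_mn * e_mn
    return total


if __name__ == "__main__":
    k, k_prime, p = cm_momenta(1.0, 1.0)
    s, t, u = mandelstam(k, k_prime, p)
    print("s + t + u =", s + t + u)
    print("N.E =", contract_tensors(k, k_prime, p), " 32 s^2 =", 32 * s * s)
```

The symmetric-times-antisymmetric cross terms cancel in the sum, so the
result should be real and independent of the scattering angle, as (20.80)
requires.
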
Since $t = -2p^2(1-\cos\theta)$, where $p$ is the CM momentum and $\theta$ the CM
scattering angle, (20.82) can alternatively be written in invariant form as
(problem 20.7)

$$\frac{d\sigma}{dt} = \frac{G_F^2}{\pi}. \tag{20.84}$$

All other purely leptonic processes may be calculated in an analogous fashion
(see Bailin 1982 and Renton 1990 for further examples).

When we discuss deep inelastic neutrino scattering in section 20.7.2, we
shall be interested in neutrino ‘laboratory’ cross sections, as in the electron
scattering case of chapter 9. A simple calculation gives $s \simeq 2m_eE$ (neglecting
squares of lepton masses by comparison with $m_eE$), where $E$ is the ‘laboratory’
energy of a neutrino incident, in this example, on a stationary electron. It
follows that the total ‘laboratory’ cross section in this Fermi-like
current-current model rises linearly with $E$. We shall return to the
implications of this in section 20.7.2.

The process (20.45) was measured by Bergsma et al. (1983) using the
CERN wide band beam ($E_\nu \sim 20$ GeV). The ratio of the observed number
of events to that expected for pure V-A was quoted as $0.98 \pm 0.12$.

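
To get a feeling for the size of (20.83), the sketch below evaluates
$\sigma = G_F^2 s/\pi$ with $s \simeq 2m_eE$ and converts the result to cm$^2$. The
electron mass, the conversion factor $(\hbar c)^2 \approx 0.3894$ GeV$^2$ mb and the
function name are inputs supplied here for illustration. The massless
approximation is only rough at $E \sim 20$ GeV, which is not far above the
muon-production threshold $E \approx m_\mu^2/2m_e$, so the number should be read as
an order-of-magnitude guide.

```python
import math

G_F = 1.166e-5                 # GeV^-2, from (20.46)
M_ELECTRON = 0.511e-3          # GeV (assumed input)
GEV_M2_TO_CM2 = 0.3894e-27     # 1 GeV^-2 = 0.3894 mb = 0.3894e-27 cm^2


def inverse_muon_decay_sigma_cm2(e_lab_gev):
    """Total cross section (20.83) in cm^2, with s = 2 m_e E (masses neglected)."""
    s = 2.0 * M_ELECTRON * e_lab_gev            # GeV^2
    sigma_natural = G_F ** 2 * s / math.pi      # GeV^-2
    return sigma_natural * GEV_M2_TO_CM2


if __name__ == "__main__":
    for energy in (20.0, 40.0, 80.0):
        print(energy, "GeV ->", inverse_muon_decay_sigma_cm2(energy), "cm^2")
```

Doubling $E$ doubles the output, which is the linear rise with laboratory
energy noted above.
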
20.6 Leptonic weak neutral currents 


The first observations of the weak neutral current process $\bar\nu_\mu e^- \to \bar\nu_\mu e^-$
were reported by Hasert et al. (1973), in a pioneering experiment using the
heavy-liquid bubble chamber Gargamelle at CERN, irradiated with a $\bar\nu_\mu$ beam. As
in the case of the charged currents, much detailed experimental work was
necessary to determine the precise form of the neutral current couplings. They
are, of course, predicted by the Glashow-Salam-Weinberg theory, as we shall
explain in chapter 22. For the moment, we continue with the current-current
approach, parametrizing the currents in a convenient way.

There are two types of ‘neutral current’ couplings, those involving neutrinos
of the form $\bar{\hat\nu}_l \ldots \hat\nu_l$, and those involving the charged leptons of the
form $\bar{\hat l} \ldots \hat l$. We shall assume the following form for these currents
(with one eye on the GSW theory to come):

(1) neutrino neutral current

$$g_N c^{\nu_l}\,\bar{\hat\nu}_l\gamma^\mu\left(\frac{1-\gamma_5}{2}\right)\hat\nu_l \qquad l = e, \mu, \tau; \tag{20.85}$$

(2) charged lepton neutral current

$$g_N\,\bar{\hat l}\gamma^\mu\left[c^l_L\frac{(1-\gamma_5)}{2} + c^l_R\frac{(1+\gamma_5)}{2}\right]\hat l \qquad l = e, \mu, \tau. \tag{20.86}$$

This is, of course, by no means the most general possible parametrization.
The neutrino coupling is retained as pure ‘V-A’, while the coupling in the
charged lepton sector is now a combination of ‘V-A’ and ‘V+A’ with certain
coefficients $c^l_L$ and $c^l_R$. We may also write the coupling in terms of ‘V’ and
‘A’ coefficients defined by $c^l_V = c^l_L + c^l_R$, $c^l_A = c^l_L - c^l_R$. An overall
factor $g_N$ determines the strength of the neutral currents as compared to the
charged ones; the $c$’s determine the relative amplitudes of the various neutral
current processes.
As we shall see, an essential feature of the GSW theory is its prediction
of weak neutral current processes, with couplings determined in terms of one
parameter of the theory called ‘$\theta_W$’, the ‘weak mixing angle’ (Glashow 1961,
Weinberg 1967). The GSW predictions for the parameter $g_N$ and the $c$’s are
(see equations (22.59)-(22.62))

$$g_N = g/\cos\theta_W, \qquad c^{\nu_l} = \frac{1}{2}, \qquad c^l_L = -\frac{1}{2} + a, \qquad c^l_R = a \tag{20.87}$$

for $l = e, \mu, \tau$, where $a = \sin^2\theta_W$ and $g$ is the SU(2) gauge coupling. Note
that a strong form of ‘universality’ is involved here too: the coefficients are
independent of the ‘flavour’ $e$, $\mu$ or $\tau$, for both neutrinos and charged leptons.
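
The couplings in (20.87), together with the ‘V’ and ‘A’ combinations
$c^l_V = c^l_L + c^l_R$ and $c^l_A = c^l_L - c^l_R$, are simple enough to tabulate. The
sketch below does so for an illustrative input value of $\sin^2\theta_W$; the
function name and the dictionary keys are our own choices.

```python
def gsw_lepton_couplings(sin2_theta_w):
    """Neutral current coefficients of (20.87) for any lepton flavour."""
    a = sin2_theta_w
    c_left = -0.5 + a
    c_right = a
    return {
        "c_nu": 0.5,                 # neutrino coupling, pure V-A
        "c_L": c_left,
        "c_R": c_right,
        "c_V": c_left + c_right,     # vector combination
        "c_A": c_left - c_right,     # axial combination
    }


if __name__ == "__main__":
    for name, value in gsw_lepton_couplings(0.23).items():
        print(name, "=", value)
```

Note that $c^l_A = -\frac{1}{2}$ whatever the mixing angle, while
$c^l_V = -\frac{1}{2} + 2a$ is small when $\sin^2\theta_W$ is close to $\frac{1}{4}$.
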

The following reactions are available for experimental measurement (in
addition to the charged current process (20.45) already discussed):

$$\nu_\mu e^- \to \nu_\mu e^-, \qquad \bar\nu_\mu e^- \to \bar\nu_\mu e^- \qquad \text{(NC)} \tag{20.88}$$
$$\nu_e e^- \to \nu_e e^-, \qquad \bar\nu_e e^- \to \bar\nu_e e^- \qquad \text{(NC+CC)} \tag{20.89}$$

where ‘NC’ means neutral current and ‘CC’ charged current. Formulae for
these cross sections are given in section 22.3. The experiments are discussed
and reviewed in Commins and Bucksbaum (1983), Renton (1990), and by
Winter (2000). All observations are in excellent agreement with the GSW
predictions, with $\theta_W$ determined as $\sin^2\theta_W \simeq 0.23$. The reader must note,
however, that modern precision measurements are sensitive to higher-order
(loop) corrections, which must be included in comparing the full GSW theory
with experiment (see section 22.6). The simultaneous fit of data from all four
reactions in terms of the single parameter $\theta_W$ already provides strong
confirmation of the theory, and indeed such confirmation was already emerging
in the late 1970s and early 1980s, before the actual discovery of the $W^\pm$
and $Z^0$ bosons. It is also interesting to note that the presence of vector (V)
interactions in the neutral current processes may suggest the possibility of
some kind of link with electromagnetic interactions, which are of course also
‘neutral’ (in this sense) and vector-like. In the GSW theory, this linkage is
provided essentially through the parameter $\theta_W$, as we shall see.




20.7 Quark weak currents 


We now turn our attention to the weak interactions of quarks. We shall begin 
by considering an earlier world, when only two generations (four flavours) 
were known. 


20.7.1 Two generations 


The original version of V-A theory was framed in terms of a nucleonic current
of the form $\bar{\hat\psi}_p\gamma^\mu(1 - r\gamma_5)\hat\psi_n$. With the acceptance of quark
substructure it was natural to re-interpret such a hadronic transition by a
charged current of the form $\bar{\hat u}\gamma^\mu(1-\gamma_5)\hat d$, very similar to the charged
lepton currents; indeed, here was a further example of ‘universality’, this time
between quarks and leptons. Detailed comparison with experiment showed, however,
that such $d \to u$ transitions were very slightly weaker than the analogous
leptonic ones; this could be established by comparing the rates for
$n \to p e^-\bar\nu_e$ and $\mu^- \to \nu_\mu e^-\bar\nu_e$.

But for quarks (or their hadronic composites) there is a further complication,
which is the very familiar phenomenon of flavour change in weak hadronic
processes (recall the discussion in section 1.2.2). The first step towards the
modern theory of quark currents was taken by Cabibbo (1963); in a sense, it
restored universality. Cabibbo postulated that the strength of the hadronic
weak interaction was shared between the $\Delta S = 0$ and $\Delta S = 1$ transitions
(where $S$ is the strangeness quantum number), the latter being relatively
suppressed as compared to the former. According to Cabibbo's hypothesis,
phrased in terms of quarks, the total weak charged current for $u$, $d$ and $s$
quarks is

$$\hat j^\mu_{\rm Cab}(u,d,s) = \cos\theta_C\,\bar{\hat u}\gamma^\mu\frac{(1-\gamma_5)}{2}\hat d + \sin\theta_C\,\bar{\hat u}\gamma^\mu\frac{(1-\gamma_5)}{2}\hat s \tag{20.90}$$

where $\theta_C$ is the ‘Cabibbo angle’ (not to be confused with $\theta_W$). We can now
postulate a total weak charged current

$$\hat j^\mu_{\rm CC}({\rm total}) = \hat j^\mu_{\rm CC}({\rm leptons}) + \hat j^\mu_{\rm Cab}(u,d,s), \tag{20.91}$$

where $\hat j^\mu_{\rm CC}$(leptons) is given by (20.38), and then generalize (20.40) to

$$\hat{\mathcal H}^{\rm tot}_{\rm CC} = \frac{4G_F}{\sqrt 2}\,\hat j^\mu_{\rm CC}({\rm total})\,\hat j^\dagger_{{\rm CC}\,\mu}({\rm total}). \tag{20.92}$$

FIGURE 20.3
Strangeness-changing semi-leptonic weak decays.


The effective interaction (20.92) describes a great many processes. The
purely leptonic ones discussed previously are, of course, present in the term
$\hat j^\mu_{\rm CC}$(leptons)$\,\hat j^\dagger_{{\rm CC}\,\mu}$(leptons). But there are also now all the
semi-leptonic processes such as the $\Delta S = 0$ (strangeness conserving) one

$$d \to u + e^- + \bar\nu_e \tag{20.93}$$

and the $\Delta S = 1$ (strangeness changing) one

$$s \to u + e^- + \bar\nu_e. \tag{20.94}$$

The notion that the ‘total current’ should be the sum of a hadronic and a
leptonic part is already familiar from electromagnetism (see, for example,
equation (8.91)).
The transition (20.94), for example, is the underlying process in semi-leptonic
decays such as

$$\Sigma^- \to n + e^- + \bar\nu_e \tag{20.95}$$

and

$$K^- \to \pi^0 + e^- + \bar\nu_e \tag{20.96}$$

as indicated in figure 20.3.

The ‘s’ quark is assigned $S = -1$ and charge $-\frac{1}{3}e$. The $s \to u$ transition
is then referred to as one with ‘$\Delta S = \Delta Q$’, meaning that the change
in the quark (or hadronic) strangeness is equal to the change in the quark
(or hadronic) charge: both the strangeness and the charge increase by 1 unit.
Prior to the advent of the quark model, and the Cabibbo hypothesis, it had
been established empirically that all known strangeness-changing semileptonic
decays satisfied the rules $|\Delta S| = 1$ and $\Delta S = \Delta Q$. The $u$-$s$ current in (20.90)
satisfies these rules automatically. Note, for example, that the process
apparently similar to (20.95), $\Sigma^+ \to n + e^+ + \nu_e$, is forbidden in the lowest
order (it requires a double quark transition from $suu$ to $udd$). All known data
on such decays can be fit with a value $\sin\theta_C \simeq 0.22$ for the Cabibbo angle
$\theta_C$. This relatively small angle is therefore a measure of the suppression of
$|\Delta S| = 1$ processes relative to $\Delta S = 0$ ones.
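
Since the $\Delta S = 0$ and $|\Delta S| = 1$ amplitudes in (20.90) carry factors
$\cos\theta_C$ and $\sin\theta_C$ respectively, the corresponding rates scale as
$\cos^2\theta_C$ and $\sin^2\theta_C$, before any differences in phase space or hadronic
matrix elements are taken into account. The sketch below simply evaluates
these weights for the quoted $\sin\theta_C \simeq 0.22$; the function name is our own.

```python
def cabibbo_weights(sin_theta_c):
    """Return (cos^2, sin^2, sin^2/cos^2) for a given sin(theta_C)."""
    sin2 = sin_theta_c ** 2
    cos2 = 1.0 - sin2
    return cos2, sin2, sin2 / cos2


if __name__ == "__main__":
    cos2, sin2, ratio = cabibbo_weights(0.22)
    print("Delta S = 0 weight:", cos2)
    print("|Delta S| = 1 weight:", sin2)
    print("relative suppression:", ratio)
```
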
The Cabibbo current can be written in a more compact form by introducing
the ‘mixed’ field

$$\hat d' = \cos\theta_C\,\hat d + \sin\theta_C\,\hat s. \tag{20.97}$$

Then

$$\hat j^\mu_{\rm Cab}(u,d,s) = \bar{\hat u}\gamma^\mu\frac{(1-\gamma_5)}{2}\hat d'. \tag{20.98}$$


In 1970 Glashow, Iliopoulos and Maiani (GIM) (1970) drew attention to
a theoretical problem with the interaction (20.92) if used in second order.
Now it is, of course, the case that this interaction is not renormalizable, as
noted previously for the purely leptonic one (20.40), since $G_F$ has dimensions
of an inverse mass squared. As we saw in section 11.7, this means that one-loop
diagrams will typically diverge quadratically, so that the contribution
of such a second-order process will be of order $(G_F \cdot G_F\Lambda^2)$ where $\Lambda$ is a
cut-off, compared to the first-order amplitude $G_F$. Recalling from (20.46) that
$G_F \sim 10^{-5}$ GeV$^{-2}$, we see that for $\Lambda \sim 10$ GeV such a correction could
be significant if accurate enough data exist. GIM pointed out, in particular,
that some second-order processes could be found which violated the (hitherto)
well-established phenomenological selection rules, such as the $|\Delta S| = 1$ and
$\Delta S = \Delta Q$ rules already discussed. For example, there could be $\Delta S = 2$
amplitudes contributing to the $K_L - K_S$ mass difference (see Renton 1990,
section 9.1.6, for example), as well as contributions to unobserved decay modes
such as

$$K^+ \to \pi^+ + \nu + \bar\nu \tag{20.99}$$

which has a neutral lepton pair in association with a strangeness change for
the hadron. In fact, experiment placed very tight limits on the rate for (20.99),
and still does: the branching fraction is $(1.7 \pm 1.1) \times 10^{-10}$ (Nakamura et
al. 2010). This seemed to imply a surprisingly low value of the cut-off, say
$\sim 3$ GeV (Mohapatra et al. 1968).
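
The size of the second-order correction relative to the first-order amplitude
is governed by the dimensionless combination $G_F\Lambda^2$. The following sketch
evaluates it for the two cut-off values mentioned above; it is a rough
order-of-magnitude estimate only, with no loop factors included, and the
function name is our own.

```python
G_F = 1.166e-5  # GeV^-2, from (20.46)


def relative_second_order_size(cutoff_gev):
    """Dimensionless estimate G_F * Lambda^2 of second-order / first-order."""
    return G_F * cutoff_gev ** 2


if __name__ == "__main__":
    for cutoff in (3.0, 10.0):
        print("Lambda =", cutoff, "GeV ->", relative_second_order_size(cutoff))
```
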

Partly in order to address this problem, and partly as a revival of an
earlier lepton-quark symmetry proposal (Bjorken and Glashow 1964), GIM
introduced a fourth quark, now called $c$ (the charm quark) with charge $\frac{2}{3}e$.
Note that in 1970 the $\tau$-lepton had not been discovered, so only two lepton
family pairs $(\nu_e, e)$, $(\nu_\mu, \mu)$ were known; this fourth quark therefore did
restore the balance, via the two quark family pairs $(u,d)$, $(c,s)$. In
particular, a second quark current could now be hypothesized, involving the
$(c,s)$ pair. GIM postulated that the $c$-quark was coupled to the ‘orthogonal’
$d$-$s$ combination (cf (20.97))

$$\hat s' = -\sin\theta_C\,\hat d + \cos\theta_C\,\hat s. \tag{20.100}$$

The complete four-quark charged current is then

$$\hat j^\mu_{\rm GIM}(u,d,c,s) = \bar{\hat u}\gamma^\mu\frac{(1-\gamma_5)}{2}\hat d' + \bar{\hat c}\gamma^\mu\frac{(1-\gamma_5)}{2}\hat s'. \tag{20.101}$$

The form (20.101) had already been suggested by Bjorken and Glashow (1964).
The new feature of GIM was the observation that, assuming an exact SU(4)$_{\rm f}$
symmetry for the four quarks (in particular, equal masses), all second-order
contributions which could have violated the $|\Delta S| = 1$, $\Delta S = \Delta Q$ selection
rules now vanished. Further, to the extent that the (unknown) mass of the
charm quark functioned as an effective cut-off $\Lambda$, due to breaking of the
SU(4)$_{\rm f}$ symmetry, they estimated $m_c$ to lie in the range 3-4 GeV, from the
observed $K_L - K_S$ mass difference.

GIM went on to speculate that the non-renormalizability could be overcome
if the weak interactions were described by an SU(2) Yang-Mills gauge
theory, involving a triplet $(W^+, W^-, W^0)$ of gauge bosons. In this case, it
is natural to introduce the idea of (weak) isospin, in terms of which the
pairs $(\nu_e, e)$, $(\nu_\mu, \mu)$, $(u, d')$, $(c, s')$ are all $t = \frac{1}{2}$ doublets with
$t_3 = \pm\frac{1}{2}$. Charge-changing currents then involve the raising matrix

$$\tau_+ \equiv \frac{1}{2}(\tau_1 + {\rm i}\tau_2) = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix} \tag{20.102}$$

and charge-lowering ones the matrix $\tau_- = (\tau_1 - {\rm i}\tau_2)/2$. The full symmetry
must also involve the matrix $\tau_3$, given by the commutator $[\tau_+, \tau_-] = \tau_3$.
Whereas $\tau_+$ and $\tau_-$ would (in this model) be associated with transitions
mediated by $W^\pm$, transitions involving $\tau_3$ would be mediated by $W^0$, and would
correspond to ‘neutral current’ transitions for quarks. We now know that
things are slightly more complicated than this: the correct symmetry is the
SU(2) $\times$ U(1) of Glashow (1961), also invoked by GIM. Skipping therefore
some historical steps, we parametrize the weak quark neutral current as (cf
(20.86) for the leptonic analogue)

$$g_N \sum_{q=u,c,d',s'} \bar{\hat q}\gamma^\mu\left[c^q_L\frac{(1-\gamma_5)}{2} + c^q_R\frac{(1+\gamma_5)}{2}\right]\hat q \tag{20.103}$$

for the four flavours so far in play. In the GSW theory, the $c^q$’s are predicted
to be

$$c^{u,c}_L = \frac{1}{2} - \frac{2}{3}a, \qquad c^{u,c}_R = -\frac{2}{3}a \tag{20.104}$$
$$c^{d,s}_L = -\frac{1}{2} + \frac{1}{3}a, \qquad c^{d,s}_R = \frac{1}{3}a \tag{20.105}$$

where $a = \sin^2\theta_W$ as before, and $g_N = g/\cos\theta_W$.
One feature of (20.103) is very important. Consider the terms

$$\bar{\hat d}'\{\ldots\}\hat d' + \bar{\hat s}'\{\ldots\}\hat s'. \tag{20.106}$$

It is simple to verify that, whereas either part of (20.106) alone contains a
strangeness changing neutral combination such as $\bar{\hat d}\{\ldots\}\hat s$ or
$\bar{\hat s}\{\ldots\}\hat d$, such combinations vanish in the sum, leaving the result
diagonal in quark flavour. Thus there are no first-order neutral
flavour-changing currents in this model, a result which will be extended to
three flavours in section 20.7.3.
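
The cancellation can be made explicit by expanding each term of (20.106) in
the $(d, s)$ basis using (20.97) and (20.100). The sketch below does this
numerically, representing each primed field by its pair of real coefficients
and assuming, as in (20.103)-(20.105), that the same Dirac structure
$\{\ldots\}$ multiplies both terms. The function names are our own.

```python
import math


def neutral_bilinear(coeffs):
    """Coefficient matrix M[i][j] of qbar_i {...} q_j in the basis (d, s)."""
    return [[ci * cj for cj in coeffs] for ci in coeffs]


def gim_terms(theta_c):
    """Return the d'-d' term, the s'-s' term and their sum as 2x2 matrices."""
    c, s = math.cos(theta_c), math.sin(theta_c)
    d_prime = (c, s)       # d' =  cos(theta_C) d + sin(theta_C) s, (20.97)
    s_prime = (-s, c)      # s' = -sin(theta_C) d + cos(theta_C) s, (20.100)
    dd = neutral_bilinear(d_prime)
    ss = neutral_bilinear(s_prime)
    total = [[dd[i][j] + ss[i][j] for j in range(2)] for i in range(2)]
    return dd, ss, total


if __name__ == "__main__":
    dd, ss, total = gim_terms(math.asin(0.22))
    print("dbar-s coefficient from d' term:", dd[0][1])
    print("dbar-s coefficient from s' term:", ss[0][1])
    print("sum:", total)
```

Each term separately has an off-diagonal (strangeness-changing) entry
$\pm\cos\theta_C\sin\theta_C$, while the sum is the unit matrix for any value of the
angle.
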

In 1974, Gaillard and Lee (1974) performed a full one-loop calculation
of the $K_L - K_S$ mass difference in the GSW model as extended by GIM
to quarks and using the renormalization techniques recently developed by ’t
Hooft (1971b). They were able to predict $m_c \sim 1.5$ GeV for the charm quark
mass, a result spectacularly confirmed by the subsequent discovery of the $c\bar c$
states in charmonium, and of charmed mesons and baryons of the appropriate
mass.

In summary, then, the essential feature of the quark weak currents in
the two-generation model is that they have the universal V-A form, but the
participating fields are $(\hat u, \hat d')$, $(\hat c, \hat s')$ where $\hat d'$ and $\hat s'$ are not the
fields $\hat d$, $\hat s$ with definite mass, but rather are related to them by an
orthogonal transformation:

$$\begin{pmatrix} \hat d' \\ \hat s' \end{pmatrix} = \begin{pmatrix} \cos\theta_C & \sin\theta_C \\ -\sin\theta_C & \cos\theta_C \end{pmatrix}\begin{pmatrix} \hat d \\ \hat s \end{pmatrix}. \tag{20.107}$$

In section 20.8 we shall enlarge this picture to three generations, where
significant new features occur, specifically CP violation. In chapter 22 we
shall see how this transformation from the ‘mass’ basis to the ‘weak
interaction’ basis arises via the gauge-invariant interactions of the Standard
Model.


20.7.2 Deep inelastic neutrino scattering 


We now have enough theory to present another illustrative calculation within 
the framework of the ‘current-current’ model, this time involving neutrinos 
and quarks. We shall calculate cross sections for deep inelastic neutrino scat- 
tering from nucleons, using the parton model introduced (for electromagnetic 
interactions) in chapter 9. In particular, we shall consider the processes 


$$\nu_\mu + N \to \mu^- + X \tag{20.108}$$
$$\bar\nu_\mu + N \to \mu^+ + X \tag{20.109}$$


which of course involve the charged currents, for both leptons and quarks. 
Studies of these reactions at Fermilab and CERN in the 1970s and 1980s 
played a crucial part in establishing the quark structure of the nucleon, in 
particular the quark distribution functions. 

The general process is illustrated in figure 20.4. By now we are becoming
accustomed to the idea that such processes are in fact mediated by the $W^\pm$,
but we shall assume that the momentum transfers are such that the W-propagator
is effectively constant. The effective lepton-quark interaction will
then take the form

$$\hat{\mathcal H}^{\rm eff}_{\nu q} = \frac{G_F}{\sqrt 2}\,\bar{\hat\mu}\gamma_\mu(1-\gamma_5)\hat\nu_\mu\,[\bar{\hat u}\gamma^\mu(1-\gamma_5)\hat d + \bar{\hat c}\gamma^\mu(1-\gamma_5)\hat s], \tag{20.110}$$

leading to expressions for the parton-level subprocess amplitudes which are
exactly similar to that in (20.50) for $\nu_\mu + e^- \to \mu^- + \nu_e$. Note that we are
considering only the four flavours $u$, $d$, $c$, $s$ to be ‘active’, and we have set
$\theta_C \approx 0$.

FIGURE 20.4
Inelastic neutrino scattering from a nucleon.

As in (20.53), the $\nu_\mu$ cross section will have the general form

$$d\sigma^{(\nu)} \propto N_{\mu\nu}W^{\mu\nu}_{(\nu)}(q,p) \tag{20.111}$$

where $N_{\mu\nu}$ is the neutrino tensor of (20.66). The form of the weak hadron
tensor $W^{\mu\nu}_{(\nu)}$ is deduced from Lorentz invariance. In the approximation of
neglecting lepton masses, we can ignore any dependence on the 4-vector $q$
since

$$q^\mu N_{\mu\nu} = q^\nu N_{\mu\nu} = 0. \tag{20.112}$$

Just as $N_{\mu\nu}$ contains the pseudotensor $\epsilon_{\mu\nu\alpha\beta}$ so too will $W^{\mu\nu}_{(\nu)}$ since
parity is not conserved. In a manner similar to equation (9.10) for the case of
electron scattering, and following the steps that led from (20.67) to (20.72),
we define effective neutrino structure functions by

$$W^{\mu\nu}_{(\nu)} = (-g^{\mu\nu})W_1^{(\nu)} + \frac{1}{M^2}p^\mu p^\nu W_2^{(\nu)} - \frac{{\rm i}}{2M^2}\epsilon^{\mu\nu\gamma\delta}p_\gamma q_\delta W_3^{(\nu)}. \tag{20.113}$$

In general, the structure functions depend on two variables, say $Q^2$ and $\nu$,
where $Q^2 = -(k-k')^2$ and $\nu = p\cdot q/M$; but in the Bjorken limit approximate
scaling is observed, as in the electron case:

$$\left.\begin{matrix} Q^2 \to \infty \\ \nu \to \infty \end{matrix}\right\} \quad x = Q^2/2M\nu \ \text{fixed} \tag{20.114}$$
$$\nu W_2^{(\nu)}(Q^2,\nu) \to F_2^{(\nu)}(x) \tag{20.115}$$
$$M W_1^{(\nu)}(Q^2,\nu) \to F_1^{(\nu)}(x) \tag{20.116}$$
$$\nu W_3^{(\nu)}(Q^2,\nu) \to F_3^{(\nu)}(x) \tag{20.117}$$

where, as with (9.21) and (9.22), the physics lies in the assertion that the $F$’s
are finite. This scaling can again be interpreted in terms of pointlike scattering
from partons, which we shall take to have quark quantum numbers.

In the ‘laboratory’ frame (in which the nucleon is at rest) the cross section
in terms of $W_1$, $W_2$ and $W_3$ may be derived in the usual way from (cf equation
(9.11))

$$d\sigma^{(\nu)} = \left(\frac{G_F}{\sqrt 2}\right)^2\frac{1}{4k\cdot p}\,4\pi M\,N_{\mu\nu}W^{\mu\nu}_{(\nu)}\,\frac{d^3k'}{2k'(2\pi)^3}. \tag{20.118}$$

In terms of ‘laboratory’ variables, one obtains (problem 20.9)

$$\frac{d^2\sigma^{(\nu)}}{dQ^2\,d\nu} = \frac{G_F^2}{2\pi}\frac{k'}{k}\left(W_2^{(\nu)}\cos^2(\theta/2) + W_1^{(\nu)}\,2\sin^2(\theta/2) + \frac{k+k'}{M}\sin^2(\theta/2)\,W_3^{(\nu)}\right). \tag{20.119}$$

For an incoming antineutrino beam, the $W_3$ term changes sign.
In neutrino scattering it is common to use the variables $x$, $\nu$ and the
‘inelasticity’ $y$ where

$$y = p\cdot q/p\cdot k. \tag{20.120}$$

In the ‘laboratory’ frame, $\nu = E - E'$ (the energy transfer to the nucleon) and
$y = \nu/E$. The cross section can be written in the form (see problem 20.9)

$$\frac{d^2\sigma^{(\nu)}}{dx\,dy} = \frac{G_F^2}{2\pi}s\left(F_2^{(\nu)}\frac{1+(1-y)^2}{2} + xF_3^{(\nu)}\frac{1-(1-y)^2}{2}\right) \tag{20.121}$$

in terms of the Bjorken scaling functions, and we have assumed the relation

$$2xF_1^{(\nu)} = F_2^{(\nu)} \tag{20.122}$$

appropriate for spin-$\frac{1}{2}$ constituents.

We now turn to the parton-level subprocesses. Their cross sections can be
straightforwardly calculated in the same way as for $\nu_\mu e^-$ scattering in section
20.5. We obtain (problem 20.10)

$$\nu q,\ \bar\nu\bar q: \qquad \frac{d^2\sigma}{dx\,dy} = \frac{G_F^2}{\pi}sx\,\delta\!\left(x - \frac{Q^2}{2M\nu}\right) \tag{20.123}$$
$$\nu\bar q,\ \bar\nu q: \qquad \frac{d^2\sigma}{dx\,dy} = \frac{G_F^2}{\pi}sx(1-y)^2\,\delta\!\left(x - \frac{Q^2}{2M\nu}\right). \tag{20.124}$$

The factor $(1-y)^2$ in the $\nu\bar q$, $\bar\nu q$ cases means that the reaction is forbidden
at $y = 1$ (backwards in the CM frame). This follows from the V-A nature of
the current, and angular momentum conservation, as a simple helicity argument
shows. Consider for example the case $\nu\bar q$ shown in figure 20.5, with the
helicities marked as shown. In our current-current interaction there are no
gradient coupling terms and therefore no momenta in the momentum-space
matrix element. This means that no orbital angular momentum is available
to account for the reversal of net helicity in the initial and final states in
figure 20.5. The lack of orbital angular momentum can also be inferred
physically from the ‘pointlike’ nature of the current-current coupling. For the
$\nu q$ or $\bar\nu\bar q$ cases, the initial and final helicities add to zero, and backward
scattering is allowed.

FIGURE 20.5
Suppression of $\nu_\mu\bar q \to \mu^-\bar q$ for $y = 1$: (a) initial state helicities; (b) final state
helicities at $y = 1$.

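The contributing processes are

$$\nu d \to l^- u, \qquad \bar\nu\bar d \to l^+\bar u \tag{20.125}$$
$$\nu\bar u \to l^-\bar d, \qquad \bar\nu u \to l^+ d, \tag{20.126}$$

the first pair having the cross section (20.123), the second (20.124). Following
the same steps as in the electron scattering case (sections 9.2 and 9.3) we
obtain

$$F_2^{\nu p} = F_2^{\bar\nu n} = 2x[d(x) + \bar u(x)] \tag{20.127}$$
$$F_3^{\nu p} = F_3^{\bar\nu n} = 2[d(x) - \bar u(x)] \tag{20.128}$$
$$F_2^{\nu n} = F_2^{\bar\nu p} = 2x[u(x) + \bar d(x)] \tag{20.129}$$
$$F_3^{\nu n} = F_3^{\bar\nu p} = 2[u(x) - \bar d(x)]. \tag{20.130}$$

Inserting (20.127) and (20.128) into (20.121), for example, we find

$$\frac{d^2\sigma^{(\nu p)}}{dx\,dy} = 2\sigma_0 x[d(x) + (1-y)^2\bar u(x)] \tag{20.131}$$

where

$$\sigma_0 = \frac{G_F^2 s}{2\pi} = \frac{G_F^2 ME}{\pi} \simeq 1.5 \times 10^{-42}(E/{\rm GeV})\,{\rm m}^2 \tag{20.132}$$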

is the basic ‘pointlike’ total cross section (compare (20.83)). Note the small 
magnitude of this cross section, as compared with the electromagnetic one of 
equation (B.18) in volume 1, which was o z Wan) x 107?*m?. Similarly, 


one finds

$$\frac{d^2\sigma^{(\bar\nu p)}}{dx\,dy} = 2\sigma_0 x[(1-y)^2u(x) + \bar d(x)]. \tag{20.133}$$

The corresponding results for $\nu n$ and $\bar\nu n$ are given by interchanging $u(x)$ and
$d(x)$, and $\bar u(x)$ and $\bar d(x)$.
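
The numerical value quoted in (20.132) is easily reproduced. The sketch below
evaluates $\sigma_0 = G_F^2ME/\pi$ and converts it to m$^2$; the nucleon mass and
the conversion factor $(\hbar c)^2 \approx 0.3894$ GeV$^2$ mb are inputs supplied here,
and the function name is our own.

```python
import math

G_F = 1.166e-5               # GeV^-2, from (20.46)
M_NUCLEON = 0.938            # GeV (assumed input)
GEV_M2_TO_M2 = 0.3894e-31    # 1 GeV^-2 = 0.3894 mb = 0.3894e-31 m^2


def sigma0_m2(e_lab_gev):
    """Basic pointlike cross section sigma_0 of (20.132), in m^2."""
    sigma_natural = G_F ** 2 * M_NUCLEON * e_lab_gev / math.pi   # GeV^-2
    return sigma_natural * GEV_M2_TO_M2


if __name__ == "__main__":
    print("sigma_0 at E = 1 GeV:", sigma0_m2(1.0), "m^2")
```
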

The target nuclei usually have high mass number (in order to increase the
cross section), with approximately equal numbers of protons and neutrons; it
is then appropriate to average the ‘n’ and ‘p’ results to obtain an ‘isoscalar’
cross section $\sigma^{(\nu N)}$ or $\sigma^{(\bar\nu N)}$:

$$\frac{d^2\sigma^{(\nu N)}}{dx\,dy} = \sigma_0 x[q(x) + (1-y)^2\bar q(x)] \tag{20.134}$$
$$\frac{d^2\sigma^{(\bar\nu N)}}{dx\,dy} = \sigma_0 x[(1-y)^2 q(x) + \bar q(x)] \tag{20.135}$$

where $q(x) = u(x) + d(x)$ and $\bar q(x) = \bar u(x) + \bar d(x)$.
Many simple and striking predictions now follow from these quark parton
results. For example, by integrating (20.134) and (20.135) over $x$ we can write

$$\frac{d\sigma^{(\nu N)}}{dy} = \sigma_0[Q + (1-y)^2\bar Q] \tag{20.136}$$
$$\frac{d\sigma^{(\bar\nu N)}}{dy} = \sigma_0[(1-y)^2 Q + \bar Q] \tag{20.137}$$

where $Q = \int xq(x)\,dx$ is the fraction of the nucleon’s momentum carried by
quarks, and similarly for $\bar Q$. These two distributions in $y$ (‘inelasticity
distributions’) therefore give a direct measure of the quark and antiquark
composition of the nucleon. Figure 20.6 shows the inelasticity distributions as
reported by the CDHS collaboration (de Groot et al. 1979), from which the
authors extracted the ratio

$$\bar Q/(Q + \bar Q) = 0.15 \pm 0.03 \tag{20.138}$$

after applying radiative corrections. An even more precise value can be
obtained by looking at the region near $y = 1$ for $\bar\nu N$ which is dominated by
$\bar Q$, the small $Q$ contribution ($\propto (1-y)^2$) being subtracted out using $\nu N$
data at the same $y$. This method yields

$$\bar Q/(Q + \bar Q) = 0.15 \pm 0.01. \tag{20.139}$$


Integrating (20.136) and (20.137) over $y$ gives

$$\sigma^{(\nu N)} = \sigma_0\left(Q + \tfrac{1}{3}\bar Q\right) \tag{20.140}$$
$$\sigma^{(\bar\nu N)} = \sigma_0\left(\tfrac{1}{3}Q + \bar Q\right) \tag{20.141}$$

and hence

$$Q + \bar Q = 3(\sigma^{(\nu N)} + \sigma^{(\bar\nu N)})/4\sigma_0 \tag{20.142}$$

while

$$\bar Q/(Q + \bar Q) = \frac{1}{2}\left(\frac{3r-1}{1+r}\right) \tag{20.143}$$

where $r = \sigma^{(\bar\nu N)}/\sigma^{(\nu N)}$. From total cross section measurements, and including
$c$ and $s$ contributions, the CHARM collaboration (Allaby et al. 1988) reported

$$Q + \bar Q = 0.492 \pm 0.006({\rm stat}) \pm 0.019({\rm syst}) \tag{20.144}$$
$$\bar Q/(Q + \bar Q) = 0.154 \pm 0.005({\rm stat}) \pm 0.011({\rm syst}). \tag{20.145}$$

The second figure is in good agreement with (20.139), and the first shows that
only about 50% of the nucleon momentum is carried by charged partons, the
rest being carried by the gluons, which do not have weak or electromagnetic
interactions.

Equations (20.140) and (20.141), together with (20.132), predict that the
total cross sections $\sigma^{(\nu N)}$ and $\sigma^{(\bar\nu N)}$ rise linearly with energy $E$. This
(parton model) prediction was confirmed as early as 1975 (Perkins 1975), soon
after the model’s success in deep inelastic electron scattering; later data is
included in figure 20.7. In fact, both $\sigma^{(\nu N)}/E$ and $\sigma^{(\bar\nu N)}/E$ are found to be
independent of $E$ up to $E \sim 350$ GeV (Nakamura et al. 2010).

Detailed comparison between the data at high energies and the earlier data
of figure 20.7 at $E_\nu$ up to 15 GeV reveals that the $\bar Q$ fraction is increasing
with energy. This is in accordance with the expectation of QCD corrections
to the parton model (section 15.6): the $\bar Q$ distribution is large at small $x$,
and scaling violations embodied in the evolution of the parton distributions
predict a rise at small $x$ as the energy scale increases.

FIGURE 20.6
Charged-current inelasticity ($y$) distribution as measured by CDHS; figure
from K Winter (2000) Neutrino Physics 2nd edn, courtesy Cambridge University
Press.

FIGURE 20.7
Low-energy $\nu$ and $\bar\nu$ cross-sections (vertical axis: $\sigma$ in units of cm$^2$/nucleon);
figure from K Winter (2000) Neutrino Physics 2nd edn, courtesy Cambridge
University Press.
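
The chain from the inelasticity distributions (20.136) and (20.137) to the
relations (20.140)-(20.143) can be followed numerically. The sketch below
integrates the $y$ distributions with a simple midpoint rule and then recovers
the momentum fractions from the two total cross sections. The input values of
$Q$ and $\bar Q$ are chosen to match the central values in (20.144) and (20.145)
purely for illustration, $\sigma_0$ is set to 1, and the function names are our own.

```python
def integrate_midpoint(func, n_steps=10000):
    """Midpoint-rule integral of func(y) over 0 <= y <= 1."""
    h = 1.0 / n_steps
    return sum(func((i + 0.5) * h) for i in range(n_steps)) * h


def total_cross_sections(q, qbar, sigma0=1.0):
    """Integrate (20.136) and (20.137) over y for given Q and Qbar."""
    sigma_nu = sigma0 * integrate_midpoint(lambda y: q + (1.0 - y) ** 2 * qbar)
    sigma_nubar = sigma0 * integrate_midpoint(lambda y: (1.0 - y) ** 2 * q + qbar)
    return sigma_nu, sigma_nubar


def antiquark_fraction(r):
    """Qbar / (Q + Qbar) from r = sigma(nubar N) / sigma(nu N), eq. (20.143)."""
    return 0.5 * (3.0 * r - 1.0) / (1.0 + r)


if __name__ == "__main__":
    qbar = 0.492 * 0.154          # illustrative inputs, see lead-in
    q = 0.492 - qbar
    sigma_nu, sigma_nubar = total_cross_sections(q, qbar)
    r = sigma_nubar / sigma_nu
    print("r =", r)
    print("Q + Qbar =", 0.75 * (sigma_nu + sigma_nubar))   # (20.142), sigma0 = 1
    print("Qbar / (Q + Qbar) =", antiquark_fraction(r))
```
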

Returning now to (20.127)-(20.130), the two sum rules of (9.65) and (9.66)
can be combined to give

$$3 = \int_0^1 dx\,[u(x) + d(x) - \bar u(x) - \bar d(x)] \tag{20.146}$$
$$\phantom{3} = \frac{1}{2}\int_0^1 dx\,(F_3^{\nu p} + F_3^{\nu n}) \tag{20.147}$$
$$\phantom{3} \equiv \int_0^1 dx\,F_3^{\nu N} \tag{20.148}$$

which is the Gross-Llewellyn Smith sum rule (1969), expressing the fact that
the number of valence quarks per nucleon is three. The CDHS collaboration
(de Groot et al. 1979) quoted

$$I_{\rm GLS} \equiv \int_0^1 dx\,F_3^{\nu N} = 3.2 \pm 0.5. \tag{20.149}$$


In perturbative QCD there are corrections expressible as a power series in $\alpha_s$, 
so that the parton model result is only reached as $Q^2 \to \infty$: 

\[
I_{\rm GLS}(Q^2) = 3\,[1 + d_1\alpha_s/\pi + d_2\alpha_s^2/\pi^2 + \ldots] \tag{20.150}
\]

[Figure 20.8: $xF_3(x, Q^2)$ plotted in bins of x between 0.045 and 0.450.] 

FIGURE 20.8 

CCFR neutrino-iron structure functions $xF_3$ (Shaevitz et al. 1995). The 
solid line is the next-to-leading order (one-loop) QCD prediction, and the 
dotted line is an extrapolation to regions outside the kinematic cuts for the 
fit. 


where $d_1 = -1$ (Altarelli et al. 1978a, 1978b) and $d_2 = -55/12 + N_f/3$ 
(Gorishnii and Larin 1986), with $N_f$ the number of active flavours. The CCFR 
collaboration (Shaevitz et al. 1995) measured $I_{\rm GLS}$ in antineutrino-nucleon 
scattering at $\langle Q^2\rangle \sim 3\ {\rm GeV}^2$. They obtained 

\[
I_{\rm GLS}(\langle Q^2\rangle = 3\ {\rm GeV}^2) = 2.50 \pm 0.02 \pm 0.08 \tag{20.151}
\]


in agreement with the $O(\alpha_s^3)$ calculation of Larin and Vermaseren (1991) using 
$\Lambda_{\overline{\rm MS}} = 250 \pm 50$ MeV. 
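
As a rough numerical illustration of the size of the correction in (20.150), the sketch below evaluates the series truncated at second order. The inputs $\alpha_s = 0.30$ and $N_f = 3$ are assumed, illustrative values near $Q^2 \sim 3\ {\rm GeV}^2$ and are not taken from the text; the function name is likewise hypothetical, and the truncation is cruder than the higher-order calculation quoted above.

```python
import math


def gls_sum_rule(alpha_s: float, n_f: int = 3) -> float:
    """Series (20.150) truncated at second order in alpha_s/pi."""
    d1 = -1.0                        # first-order coefficient
    d2 = -55.0 / 12.0 + n_f / 3.0    # second-order coefficient
    a = alpha_s / math.pi
    return 3.0 * (1.0 + d1 * a + d2 * a * a)


# Parton-model limit (alpha_s -> 0) and an illustrative corrected value.
parton_limit = gls_sum_rule(0.0)
corrected = gls_sum_rule(0.30, n_f=3)  # alpha_s = 0.30 is an assumed input
print(parton_limit, corrected)
```

With these assumed inputs the corrections pull the sum rule below the parton-model value of 3, in the same direction as the measured value (20.151).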

The predicted $Q^2$ evolution of $xF_3$ is particularly simple since it is not 
coupled to the gluon distribution. To leading order, the $xF_3$ evolution is 
given by (cf (15.109)) 

\[
\frac{d}{d\ln Q^2}\left(xF_3(x,Q^2)\right) = \frac{\alpha_s(Q^2)}{2\pi}\int_x^1 P_{qq}(z)\, xF_3\!\left(\frac{x}{z},Q^2\right)\frac{dz}{z}. \tag{20.152}
\]


Figure 20.8, taken from Shaevitz et al. (1995), shows a comparison of the 
CCFR data with the next-to-leading order calculation of Duke and Owens 
(1984). This fit yields a value of $\alpha_s$ at $Q^2 = M_Z^2$ given by 

\[
\alpha_s(M_Z^2) = 0.111 \pm 0.002 \pm 0.003. \tag{20.153}
\]


The Adler sum rule (Adler 1963) involves the functions $F_2^{\bar\nu p}$ and $F_2^{\nu p}$: 

\[
I_A = \int_0^1 \frac{dx}{x}\,(F_2^{\bar\nu p} - F_2^{\nu p}). \tag{20.154}
\]

In the simple model of (20.127)-(20.130), the right-hand side of $I_A$ is just 

\[
2\int_0^1 dx\,(u(x) + \bar d(x) - d(x) - \bar u(x)) \tag{20.155}
\]


which represents four times the average of $I_3$ (isospin) of the target, which is 
$\frac{1}{2}$ for the proton. This sum rule follows from the conservation of the charged 
weak current (as will be true in the Standard Model, since this is a gauge 
symmetry current, as we shall see in the following chapter). Its measurement, 
however, depends precisely on separating the non-isoscalar contribution ($I_A$ 
vanishes for the isoscalar average ‘N’). The BEBC collaboration (Allasia et al. 
1984, 1985) reported: 

\[
I_A = 2.02 \pm 0.40, \tag{20.156}
\]


in agreement with the expected value 2. 
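Relations (20.127)-(20.130) allow the $F_2$ functions for electron (muon) and 
neutrino scattering to be simply related. From (9.58) and (9.61) we have 

\[
F_2^{eN} \equiv \tfrac{1}{2}(F_2^{ep} + F_2^{en}) = \tfrac{5}{18}\,x(u + \bar u + d + \bar d) + \tfrac{1}{9}\,x(s + \bar s) + \cdots \tag{20.157}
\]

while (20.127) and (20.129) give 

\[
F_2^{\nu N} \equiv \tfrac{1}{2}(F_2^{\nu p} + F_2^{\nu n}) = x(u + d + \bar u + \bar d). \tag{20.158}
\]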


Assuming that the non-strange contributions dominate, the neutrino and 
charged lepton structure functions should be approximately in the ratio 18/5, 
which is the reciprocal of the mean squared charge of the u and d quarks 
in the nucleon. Figure 20.9 shows the neutrino results on $F_2$ and $xF_3$ together 
with those from several $\mu N$ experiments scaled by the factor 18/5. The 
agreement is satisfactory for a tree-level parton model calculation. 
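
The factor 18/5 can be checked with exact rational arithmetic. The following is only a small sketch of that arithmetic; the variable names are illustrative.

```python
from fractions import Fraction

# Quark charges in units of the proton charge.
e_u = Fraction(2, 3)
e_d = Fraction(-1, 3)

# Mean squared charge of the u and d quarks, as it appears in (20.157).
mean_sq_charge = (e_u**2 + e_d**2) / 2

# Its reciprocal is the neutrino / charged-lepton rescaling factor.
ratio = 1 / mean_sq_charge
print(mean_sq_charge, ratio)
```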

From (20.127)-(20.130) we see that $F_2^{\nu N} - xF_3^{\nu N} = 2x(\bar u + \bar d)$, which is just 
the sea distribution; figure 20.9 shows that this is concentrated at small x, as 
we already inferred in section 9.3. 

We have mentioned QCD corrections to the simple parton model at several 
points. Clearly the full machinery introduced in chapter 16, in the context 
of deep inelastic charged lepton scattering, can be employed for the case of 
neutrino scattering also. For further access to this area we refer to Ellis et al. 
(1996), chapter 4, and Winter (2000) chapter 5. 

FIGURE 20.9 

Comparison of neutrino results (experiments CCFRR, CDHSW and CHARM) 
on $F_2(x)$ and $xF_3(x)$ with those from muon production (experiments BCDMS, 
BFP and EMC) properly rescaled by the factor 18/5, for $Q^2$ ranging between 
10 and 1000 GeV²; figure from K Winter (2000) Neutrino Physics 2nd edn, 
courtesy Cambridge University Press. 


20.7.3 Three generations 


We have seen in section 20.2.2 that the V-A interaction violates both P and 
C, and that it conserves CP in interactions with massless neutrinos. But 
we know (section 4.2.3) that CP-violating transitions occur, among states 
formed from quarks in the first two generations, albeit at a very slow rate. Is 
it possible, in fact, to incorporate CP-violation with only two generations of 
quarks? 

To answer this question, we need to go back and examine the CP-transfor- 
mation properties of the interactions in more detail. Rather than work with 
the current-current form, which is after all only an approximation valid for 
energies much less than $M_{W,Z}$, we shall look at the actual gauge interactions 
of the electroweak theory. Given the form of those interactions, we want to 
know the condition for CP-violation to be present. 

Consider then the particular interaction involved in $u \leftrightarrow d$ transitions: 

\[
V_{ud}\,\bar{\hat u}\gamma^\mu \hat d\,\hat W_\mu + V^*_{ud}\,\bar{\hat d}\gamma^\mu \hat u\,\hat W^\dagger_\mu - V_{ud}\,\bar{\hat u}\gamma^\mu\gamma_5 \hat d\,\hat W_\mu - V^*_{ud}\,\bar{\hat d}\gamma^\mu\gamma_5 \hat u\,\hat W^\dagger_\mu, \tag{20.159}
\]

where $\hat W_\mu = (\hat W^1_\mu - i\hat W^2_\mu)/\sqrt{2}$ destroys the $W^+$ or creates the $W^-$. We have 
written out the Hermitian conjugate terms explicitly, keeping the coupling 
$V_{ud}$ complex for the sake of generality, and separating the vector from the 
axial vector parts. Problem 20.11 shows that the different parts of (20.159) 
transform under C as follows (normal ordering being understood in all cases): 


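\[
C:\quad \bar{\hat u}\gamma^\mu \hat d \to -\bar{\hat d}\gamma^\mu \hat u, \qquad \bar{\hat u}\gamma^\mu\gamma_5 \hat d \to +\bar{\hat d}\gamma^\mu\gamma_5 \hat u, \tag{20.160}
\]

and we also know that under C, $\hat W_\mu \to -\hat W^\dagger_\mu$ (the dagger is as in the charged 
scalar field case, and the minus sign is as in the photon $\hat A_\mu$ case). Hence under 
C, (20.159) transforms into 

\[
V_{ud}\,\bar{\hat d}\gamma^\mu \hat u\,\hat W^\dagger_\mu + V^*_{ud}\,\bar{\hat u}\gamma^\mu \hat d\,\hat W_\mu + V_{ud}\,\bar{\hat d}\gamma^\mu\gamma_5 \hat u\,\hat W^\dagger_\mu + V^*_{ud}\,\bar{\hat u}\gamma^\mu\gamma_5 \hat d\,\hat W_\mu. \tag{20.161}
\]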


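Under P, $\hat W_\mu$ behaves like an ordinary four-vector, so the ‘vector-vector’ products 
in (20.161) are even under P, while the ‘axial vector-vector’ products are 
odd under P. Thus finally, under the combined CP transformation (20.159) 
becomes 

\[
V_{ud}\,\bar{\hat d}\gamma^\mu \hat u\,\hat W^\dagger_\mu + V^*_{ud}\,\bar{\hat u}\gamma^\mu \hat d\,\hat W_\mu - V_{ud}\,\bar{\hat d}\gamma^\mu\gamma_5 \hat u\,\hat W^\dagger_\mu - V^*_{ud}\,\bar{\hat u}\gamma^\mu\gamma_5 \hat d\,\hat W_\mu. \tag{20.162}
\]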


Comparing (20.159) with (20.162) we deduce the essential result that this 
interaction conserves CP if and only if 

\[
V_{ud} = V^*_{ud}, \tag{20.163}
\]

that is, if the coupling is real. The same is true for all the other couplings $V_{ij}$. 
The couplings we have introduced in this chapter so far only involve the 
real Fermi constant $G_F$, and the elements of the Cabibbo-GIM matrix which 
enters into the relation between the weakly interacting fields (d′, s′) and the 
fields with definite mass (d, s): 

\[
\begin{pmatrix} d' \\ s' \end{pmatrix} = \begin{pmatrix} \cos\theta_C & \sin\theta_C \\ -\sin\theta_C & \cos\theta_C \end{pmatrix}\begin{pmatrix} d \\ s \end{pmatrix} \equiv V_{\rm CGIM}\begin{pmatrix} d \\ s \end{pmatrix}. \tag{20.164}
\]

All these couplings are plainly real. But could we perhaps parametrize the 
$(d', s') \leftrightarrow (d, s)$ transformation differently, so as to smuggle in a complex, 
CP-violating, coupling? 

This is the question that Kobayashi and Maskawa asked themselves in 1972 
(Kobayashi 2009, Maskawa 2009). To answer it is not completely straightforward, 
because we can always change the phases of the quark fields by independent 
constant amounts. A rephasing of the quark fields in the transition 
$i \leftrightarrow j$ with coupling $V_{ij}$ changes $V_{ij}$ by the phase factor $\exp i(\alpha_i - \alpha_j)$. We 
need to know whether, after allowing for this rephasing of the quark fields, an 
‘irreducible’ complex coupling can remain. 

First of all, note that the matrix $V_{\rm CGIM}$ appearing in (20.164) is orthogonal, 
and this property guaranteed the vanishing of tree-level neutral 
strangeness-changing transitions, as we saw after (20.106). But this could 
just as well be achieved if the matrix were unitary. Now a general complex 2 × 2 matrix 
has 8 real parameters; unitarity gives 2 real conditions from the diagonal elements 
of $V_{\rm CGIM} V^\dagger_{\rm CGIM} = I$, and one complex condition from the off-diagonal 
elements, leaving four real parameters. If all the elements are taken to be 
real from the beginning, the matrix becomes orthogonal, as in (20.164), and 
depends on only one real parameter, the ‘rotation’ in the 2-dimensional d-s 
space. So in the general, unitary case, the matrix will have one real angle 
parameter, and three phase parameters. But we have four quark fields whose 
phases we can adjust. In fact, since only phase differences enter, we really only 
have three free phases at our disposal, but that is just enough to transform 
away the three phases in the unitary version of $V_{\rm CGIM}$, leaving it in the real 
orthogonal form (20.164). Kobayashi and Maskawa therefore concluded that 
the 2-generation GIM-type theory could not accommodate CP-violation. 

In a step which may seem natural now but was very bold in 1972, they 
decided to see if there was room for CP-violation in a 3-generation model. 
(Remember that there was no sign of any third generation particles at that 
time.) The matrix transforming from the mass basis to the weak basis is 
now a 3 x 3 unitary matrix V, with 18 real parameters. There are three real 
diagonal conditions from unitarity, and three complex off-diagonal conditions, 
leaving 9 real parameters. If the elements of V are taken to be real, one has an 
orthogonal (rotation) matrix, which can be parametrized by three real Euler 
angles. That leaves 6 phase parameters in the general unitary V. We also 
have 6 quark fields, with 5 phase differences which can be adjusted. Thus 
just one irreducible phase degree of freedom can remain in V, after quark 
rephasing. Consequently, the three-generation model naturally accommodates 
CP-violation in the quark sector: this was the great discovery of Kobayashi 
and Maskawa (1973). It was another four years before the existence of the b 
quark was established, and more than twenty before the t quark was produced. 
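
The counting argument used for two and three generations can be written out for a general number n of generations. The sketch below simply mirrors the steps in the text (unitarity conditions, rotation angles, removable quark phases); the function and key names are illustrative and not from the original.

```python
def ckm_parameter_count(n: int) -> dict:
    """Count the parameters of an n x n unitary quark-mixing matrix."""
    total_real = 2 * n * n                       # general complex n x n matrix
    # n real diagonal conditions plus n(n-1)/2 complex off-diagonal ones
    unitarity_conditions = n + 2 * (n * (n - 1) // 2)
    remaining = total_real - unitarity_conditions
    angles = n * (n - 1) // 2                    # real rotation (Euler) angles
    phases = remaining - angles
    removable = 2 * n - 1                        # phase differences of 2n quark fields
    return {
        "remaining_real_parameters": remaining,
        "angles": angles,
        "phases": phases,
        "removable_phases": removable,
        "irreducible_phases": phases - removable,
    }


for n in (2, 3):
    print(n, ckm_parameter_count(n))
```

For n = 2 no irreducible phase survives, while for n = 3 exactly one remains, as found by Kobayashi and Maskawa.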

The 3-generation matrix V, written out in full, is 

\[
V = \begin{pmatrix} V_{ud} & V_{us} & V_{ub} \\ V_{cd} & V_{cs} & V_{cb} \\ V_{td} & V_{ts} & V_{tb} \end{pmatrix}, \tag{20.165}
\]

and is called the CKM matrix, after Cabibbo, Kobayashi, and Maskawa. 
Clearly, there is no unique parametrization of V. One that has now become 
standard (Nakamura et al. 2010) is (Chau and Keung 1984) 

\[
V = \begin{pmatrix}
c_{12}c_{13} & s_{12}c_{13} & s_{13}e^{-i\delta} \\
-s_{12}c_{23} - c_{12}s_{23}s_{13}e^{i\delta} & c_{12}c_{23} - s_{12}s_{23}s_{13}e^{i\delta} & s_{23}c_{13} \\
s_{12}s_{23} - c_{12}c_{23}s_{13}e^{i\delta} & -c_{12}s_{23} - s_{12}c_{23}s_{13}e^{i\delta} & c_{23}c_{13}
\end{pmatrix} \tag{20.166}
\]

where $c_{ij} = \cos\theta_{ij}$, $s_{ij} = \sin\theta_{ij}$ with i, j = 1, 2, 3; the $\theta_{ij}$ may be thought of as 
the three Euler angles in an orthogonal V, and $\delta$ is the remaining irreducible 
CP-violating phase. In the limit $\theta_{13} = \theta_{23} = 0$, this CKM matrix reduces to 
the Cabibbo-GIM matrix with $\theta_{12} = \theta_C$. 

However, it would also be desirable to have a measure of CP-violation 
that was independent of quark rephasing. Consider one of the off-diagonal 
unitarity conditions, 

\[
V_{ud}V^*_{ub} + V_{cd}V^*_{cb} + V_{td}V^*_{tb} = 0. \tag{20.167}
\]


(Note that the complex conjugate of this equation gives another, independent, 
condition.) The best-measured of these products is $V_{cd}V^*_{cb}$; dividing by this 
quantity, (20.167) can be written as 

\[
1 + z_1 + z_2 = 0, \tag{20.168}
\]

where 

\[
z_1 = \frac{V_{td}V^*_{tb}}{V_{cd}V^*_{cb}}, \qquad z_2 = \frac{V_{ud}V^*_{ub}}{V_{cd}V^*_{cb}}. \tag{20.169}
\]


When viewed in the complex plane, relation (20.168) is the statement that the 
vectors (1,0), $z_1$ and $z_2$ close to form a triangle as shown in figure 20.10, one 
of 6 such unitarity triangles that can be formed. The area $\Delta$ of this triangle 
is 

\[
\Delta = \frac{1}{2}\,{\rm Im}(z_2 z_1^*) = \frac{1}{2}\,{\rm Im}\left(\frac{V_{ud}V^*_{ub}V^*_{td}V_{tb}}{|V_{cd}|^2\,|V_{cb}|^2}\right). \tag{20.170}
\]

Recalling that a rephasing multiplies $V_{ij}$ by $\exp i(\alpha_i - \alpha_j)$, we see that $\Delta$ is 
rephasing invariant; in particular, so is the numerator J where 

\[
J = {\rm Im}(V_{ud}V_{tb}V^*_{ub}V^*_{td}) \tag{20.171}
\]


is a Jarlskog invariant (Jarlskog 1985). J may be thought of as follows: (i) 
strike out the ‘c’ row and ‘s’ column of V; (ii) take the complex conjugate of 
the off-diagonal elements in the 2 x 2 matrix that remains; (iii) multiply the 
four elements and take the imaginary part. There is nothing special about 
this particular row and column: there are nine different ways of choosing to 
pair one row with one column, but all such Js are equal up to a sign, because 
of the unitarity of V. In the parametrization (20.166), J takes the form 


\[
J = c_{12}s_{12}c_{23}s_{23}c_{13}^2 s_{13}\sin\delta, \tag{20.172}
\]

which vanishes if any $\theta_{ij} = 0$ or $\pi/2$, or if $\delta = 0$ or $\pi$. 
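
These statements can be checked numerically. The sketch below builds V in the parametrization (20.166), tests unitarity, and compares the Jarlskog invariant of (20.171) with the closed form (20.172). The angle and phase values are assumed, illustrative inputs of roughly the observed size, and the function names are hypothetical.

```python
import cmath
import math


def ckm(theta12, theta23, theta13, delta):
    """CKM matrix in the standard parametrization (20.166)."""
    c12, s12 = math.cos(theta12), math.sin(theta12)
    c23, s23 = math.cos(theta23), math.sin(theta23)
    c13, s13 = math.cos(theta13), math.sin(theta13)
    e = cmath.exp(1j * delta)
    return [
        [c12 * c13, s12 * c13, s13 / e],
        [-s12 * c23 - c12 * s23 * s13 * e, c12 * c23 - s12 * s23 * s13 * e, s23 * c13],
        [s12 * s23 - c12 * c23 * s13 * e, -c12 * s23 - s12 * c23 * s13 * e, c23 * c13],
    ]


def unitarity_defect(V):
    """Largest |(V V^dagger - 1)_ij| over all entries."""
    worst = 0.0
    for i in range(3):
        for j in range(3):
            entry = sum(V[i][k] * V[j][k].conjugate() for k in range(3))
            worst = max(worst, abs(entry - (1.0 if i == j else 0.0)))
    return worst


def jarlskog(V):
    """J = Im(V_ud V_tb V_ub^* V_td^*); rows are u, c, t and columns d, s, b."""
    return (V[0][0] * V[2][2] * V[0][2].conjugate() * V[2][0].conjugate()).imag


# Assumed illustrative inputs (not fitted values from the text).
t12, t23, t13, delta = math.asin(0.2253), math.asin(0.0410), math.asin(0.00347), 1.2
V = ckm(t12, t23, t13, delta)
closed_form = (math.cos(t12) * math.sin(t12) * math.cos(t23) * math.sin(t23)
               * math.cos(t13) ** 2 * math.sin(t13) * math.sin(delta))
print(unitarity_defect(V), jarlskog(V), closed_form)
```

Setting `delta` to zero, or any one angle to zero, makes `jarlskog` return zero, in line with the remark following (20.172).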

The CKM matrix is an integral part of the Standard Model, and testing 
its validity is an important experimental goal. Various tests are possible. 
Consider first the magnitudes of the CKM elements. These must satisfy six 
relations following from the unitarity of V: namely, the sum of the squares 
of the absolute values of the elements of each row, and of each column, must 
add up to unity. 

The magnitudes of the six elements of the first two rows have been determined 
from measurements of semileptonic decay rates: for example, the 
amplitude for the tree-level process $d \to u + e^- + \bar\nu_e$ is proportional to $V_{ud}$. 
But non-perturbative strong interaction effects enter into the amplitudes for 
corresponding measured hadronic transitions, such as $n \to p + e^- + \bar\nu_e$ or 
$\pi^- \to \pi^0 + e^- + \bar\nu_e$. In many cases these hadronic factors in the matrix 
elements can now be calculated by unquenched lattice QCD. 

The status of the experimental determination of the moduli $|V_{ij}|$ is regularly 
reviewed by the Particle Data Group. The current results for the unitarity 
checks are (Ceccucci et al. 2010) 

\[
\begin{aligned}
|V_{ud}|^2 + |V_{us}|^2 + |V_{ub}|^2 &= 0.9999 \pm 0.0006 && (20.173)\\
|V_{cd}|^2 + |V_{cs}|^2 + |V_{cb}|^2 &= 1.101 \pm 0.074 && (20.174)\\
|V_{ud}|^2 + |V_{cd}|^2 + |V_{td}|^2 &= 1.002 \pm 0.005 && (20.175)\\
|V_{us}|^2 + |V_{cs}|^2 + |V_{ts}|^2 &= 1.098 \pm 0.074. && (20.176)
\end{aligned}
\]


Evidently these results are fully consistent with the CKM prediction of uni- 
tarity. 

The most accurate values of the nine magnitudes are obtained by a global 
fit to all the available measurements, imposing the constraints of 3-generation 
unitarity. The current result for the magnitudes, imposing these constraints, 
is (Ceccucci et al. 2010) 


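\[
V = \begin{pmatrix}
0.97428 \pm 0.00015 & 0.2253 \pm 0.0007 & 0.00347^{+0.00016}_{-0.00012} \\
0.2252 \pm 0.0007 & 0.97345^{+0.00015}_{-0.00016} & 0.0410^{+0.0011}_{-0.0007} \\
0.00862^{+0.00026}_{-0.00020} & 0.0403^{+0.0011}_{-0.0007} & 0.999152^{+0.000030}_{-0.000045}
\end{pmatrix}, \tag{20.177}
\]

and the Jarlskog invariant is $J = (2.91^{+0.19}_{-0.11}) \times 10^{-5}$. 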
From (20.177) it follows that the mixing angles are small, and moreover 
satisfy a definite hierarchy 

\[
1 \gg \theta_{12} \gg \theta_{23} \gg \theta_{13}. \tag{20.178}
\]


In more physical terms, hadrons evidently prefer to decay semileptonically 
to the nearest generation. Also, because the elements $V_{ub}$, $V_{cb}$, $V_{td}$ and $V_{ts}$, 
which connect the third generation to the first two, are all quite small, the 
physics of the first two generations is hardly influenced by the presence of the 
third. This reflects, in quantitative terms, the success of the Cabibbo-GIM 
description, and the fact that the CP-violation seen in the K-meson sector 
is so weak. CP-violation is much more visible in B physics, as Carter and 
Sanda (1980, 1981) were the first to suggest, and as we shall discuss in the 
following chapter. 

Consider now the complex-valued off-diagonal unitarity conditions, in particular 
the condition (20.168). Following Wolfenstein (1983), we identify $s_{12}$ as 
the small parameter $\lambda$, and write $V_{cb} \simeq s_{23} = A\lambda^2$ and $V_{ub} = s_{13}\exp(-i\delta) = 
A\lambda^3(\rho - i\eta)$ with $A \simeq 1$ and $|\rho - i\eta| < 1$. This gives 

\[
V \simeq \begin{pmatrix}
1 - \lambda^2/2 & \lambda & A\lambda^3(\rho - i\eta) \\
-\lambda & 1 - \lambda^2/2 & A\lambda^2 \\
A\lambda^3(1 - \rho - i\eta) & -A\lambda^2 & 1
\end{pmatrix}, \tag{20.179}
\]

[Figure 20.10: triangle with vertices at (0, 0), (1, 0) and $(\bar\rho, \bar\eta)$.] 


FIGURE 20.10 
The unitarity triangle represented by (20.168). 


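neglecting terms of order $\lambda^4$ and higher. Then 

\[
z_1 = \frac{V_{td}V^*_{tb}}{V_{cd}V^*_{cb}} \simeq \rho + i\eta - 1, \qquad z_2 = \frac{V_{ud}V^*_{ub}}{V_{cd}V^*_{cb}} \simeq -(\rho + i\eta), \qquad J \simeq A^2\lambda^6\eta. \tag{20.180}
\]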

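The unitarity triangle represented by the condition (20.168), or alternatively 
$-z_1 - z_2 = 1$, is therefore a triangle on the base (1,0), with sides $\rho + i\eta$ and 
$1 - (\rho + i\eta)$. Buras et al. (1994) showed that including terms up to order $\lambda^5$ 
changes $(\rho, \eta)$ to $(\bar\rho, \bar\eta)$ where $\bar\rho = (1 - \lambda^2/2)\rho$, $\bar\eta = (1 - \lambda^2/2)\eta$. The top vertex 
of the triangle in figure 20.10 is therefore at the point $(\bar\rho, \bar\eta)$. The angles $\alpha$, $\beta$ 
and $\gamma$ (also called $\phi_2$, $\phi_1$ and $\phi_3$) are defined by 

\[
\begin{aligned}
\alpha \equiv \phi_2 &= \arg\left(-\frac{V_{td}V^*_{tb}}{V_{ud}V^*_{ub}}\right) \approx \arg\left(-\frac{1 - \bar\rho - i\bar\eta}{\bar\rho + i\bar\eta}\right) && (20.181)\\
\beta \equiv \phi_1 &= \arg\left(-\frac{V_{cd}V^*_{cb}}{V_{td}V^*_{tb}}\right) \approx \arg\left(\frac{1}{1 - \bar\rho - i\bar\eta}\right) && (20.182)\\
\gamma \equiv \phi_3 &= \arg\left(-\frac{V_{ud}V^*_{ub}}{V_{cd}V^*_{cb}}\right) \approx \arg(\bar\rho + i\bar\eta) && (20.183)
\end{aligned}
\]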
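
As a quick consistency check on the approximate forms of these definitions, the sketch below evaluates the three angles for a given apex $(\bar\rho, \bar\eta)$ and confirms that they add up to $\pi$. The apex coordinates used are assumed, illustrative numbers rather than fitted values, and the function name is hypothetical.

```python
import cmath
import math


def triangle_angles(rho_bar: float, eta_bar: float):
    """Angles (alpha, beta, gamma) from the approximate forms of (20.181)-(20.183)."""
    apex = complex(rho_bar, eta_bar)          # rho_bar + i eta_bar
    alpha = cmath.phase(-(1 - apex) / apex)   # arg(-(1 - rho - i eta)/(rho + i eta))
    beta = cmath.phase(1 / (1 - apex))        # arg(1/(1 - rho - i eta))
    gamma = cmath.phase(apex)                 # arg(rho + i eta)
    return alpha, beta, gamma


# Assumed apex position, for illustration only.
alpha, beta, gamma = triangle_angles(0.13, 0.35)
print([math.degrees(a) for a in (alpha, beta, gamma)])
print(alpha + beta + gamma, math.pi)
```

The area of the triangle on the unit base is $\bar\eta/2$, so a non-zero $\bar\eta$ is what signals CP violation in this picture.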


The sides of this triangle are determined by the magnitudes of the CKM 
elements, and so another check is provided by the condition that the three 
sides should close to form a triangle. Further independent constraints are 
provided by measurements of the angles $\alpha$, $\beta$, and $\gamma$ which are directly related 
to CP-violation effects, as we shall discuss in the following chapter. Figure 
20.11 shows a plot of all the constraints in the $\bar\rho, \bar\eta$ plane from many different 
measurements (combined following the approach of Charles et al. 2005 and 
Hocker et al. 2001), and the global fit, as presented by Ceccucci et al. (2010). 
The annular region labelled by $|V_{ub}|$ represents, for example, the uncertainty 
in the determination of $|z_2| = |V_{ud}V^*_{ub}/V_{cd}V^*_{cb}|$, which is principally due to the 
uncertainty in $|V_{ub}|$. The region labelled by $\Delta m_d$ represents the constraint on 

[Figure 20.11: constraints in the $\bar\rho, \bar\eta$ plane; the excluded area has CL > 0.95.] 


FIGURE 20.11 

Constraints in the $\bar\rho, \bar\eta$ plane. The shaded areas have 95% CL. [Figure reproduced, 
courtesy Michael Barnett for the Particle Data Group, from the review 
of the CKM Quark-Mixing Matrix by A Ceccucci, Z Ligeti and Y Sakai, section 
11 in the Review of Particle Physics, K Nakamura et al. (Particle Data 
Group) Journal of Physics G 37 (2010) 075021, IOP Publishing Limited.] 
(See color plate III.) 


$|z_1| = |V_{td}V^*_{tb}/V_{cd}V^*_{cb}|$, where $|V_{td}|$ is deduced from the value of the $B^0$-$\bar B^0$ 
mass difference $\Delta m_d$ measured in $B^0$-$\bar B^0$ oscillations mediated by top-quark 
dominated box diagrams (see section 21.2.1 in the following chapter); here 
the uncertainties are dominated by lattice QCD. Figure 20.11 represents an 
enormous experimental effort, especially in the decade 2000-2010. The 95% 
CL regions all overlap consistently. It is quite remarkable how the single 
CP-violating parameter, three-generation scheme of Kobayashi and Maskawa 
(1973) has withstood this searching test. 

FIGURE 20.12 
Effective four-fermion non-leptonic weak transition at the quark level. 


FIGURE 20.13 
Non-leptonic weak decay of $\Lambda^0$ using the process of figure 20.12, with the 
addition of two ‘spectator’ quarks. 


20.8 Non-leptonic weak interactions 


The CKM 6-quark charged weak current, which replaces the GIM current 
(20.101), is 
4 zo QQ0- 4 x Q,Q0- = í,(1— 05); 
Jom (u, d, s, c, t, b) = iy, C 09) X + Bye 08) Wy + fy C09) zy. 
(20.184) 
and the effective weak Hamiltonian of (20.92) (as modified by CKM) clearly 
contains the term T 
$ E^ ^ 
Héc(2) = alex 3er (2) (20.185) 
in which no lepton fields are present (just as there are no quarks in (20.40)). 
This interaction is responsible, at the quark level, for transitions involving 
four quark (or antiquark) fields at a point. For example, the process shown in 
figure 20.12 can occur. By ‘adding on’ another two quark lines u and d, which 
undergo no weak interaction, we arrive at figure 20.13, which represents the 
non-leptonic decay $\Lambda^0 \to p\pi^-$. 
This figure is, of course, rather symbolic since there are strong QCD 
interactions (not shown) which are responsible for binding the three-quark 
systems into baryons, and the $q\bar q$ system into a meson. Unlike the case of 
deep inelastic lepton scattering, these QCD interactions cannot be treated 
perturbatively, since the distance scales involved are typically those of the 
hadron sizes (~ 1 fm), where perturbation theory fails. This means that 
non-leptonic weak interactions among hadrons are difficult to analyze quantitatively, 
though progress can be made via lattice QCD. Similar difficulties 
also arise, evidently, in the case of semi-leptonic decays. In general, one has 
to begin in a phenomenological way, parametrizing the decay amplitudes in 
terms of appropriate form factors (which are analogous to the electromagnetic 
form factors introduced in chapter 8). In the case of transitions involving at 
least one heavy quark Q, Isgur and Wise (1989, 1990) noticed that a considerable 
simplification occurs in the limit $m_Q \to \infty$. For example, one universal 
function (the ‘Isgur-Wise form factor’) is sufficient to describe a large number 
of hadronic form factors introduced for semi-leptonic transitions between two 
heavy pseudoscalar ($0^-$) or vector ($1^-$) mesons. For an introduction to the 
Isgur-Wise theory we refer to Donoghue et al. (1992). 

The non-leptonic sector is, however, the scene of some very interesting 
physics, such as $K^0$-$\bar K^0$ and $B^0$-$\bar B^0$ oscillations, and CP violation in the 
$K^0$-$\bar K^0$, $D^0$-$\bar D^0$ and $B^0$-$\bar B^0$ systems. We shall discuss these phenomena in 
the following chapter. 




Problems 


20.1 Show that in the non-relativistic limit ($|p| \ll M$) the matrix element 
$\bar u_p\gamma^\mu u_n$ of (20.2) vanishes if p and n have different spin states. 


20.2 Verify the normalization $N = (E + |p|)^{1/2}$ in (20.23). 
20.3 Verify (20.30) and (20.31). 
20.4 Verify that equations (20.32) are invariant under CP. 


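20.5 The matrix $\gamma_5$ is defined by $\gamma_5 = i\gamma^0\gamma^1\gamma^2\gamma^3$. Prove the following 
properties: 

(a) $\gamma_5^2 = 1$ and hence that 
$(1 + \gamma_5)(1 - \gamma_5) = 0$; 
(b) from the anticommutation relations of the other $\gamma$ matrices, show that 
$\{\gamma_5, \gamma_\mu\} = 0$ 
and hence that 
$(1 + \gamma_5)\gamma_0 = \gamma_0(1 - \gamma_5)$ 
and 
$(1 + \gamma_5)\gamma_0\gamma_\mu = \gamma_0\gamma_\mu(1 + \gamma_5)$. 


20.6 

(a) Consider the two-dimensional antisymmetric tensor $\epsilon_{ij}$ defined by 
$\epsilon_{12} = +1$, $\epsilon_{21} = -1$, $\epsilon_{11} = \epsilon_{22} = 0$. 
By explicitly enumerating all the possibilities (if necessary), convince 
yourself of the result 
$\epsilon_{ij}\epsilon_{kl} = +1(\delta_{ik}\delta_{jl} - \delta_{il}\delta_{jk})$. 
Hence prove that 
$\epsilon_{ij}\epsilon_{il} = \delta_{jl}$ and $\epsilon_{ij}\epsilon_{ij} = 2$ 
(remember, in two dimensions, $\sum_i \delta_{ii} = 2$). 

(b) By similar reasoning to that in part (a) of this question, it can be shown 
that the product of two three-dimensional antisymmetric tensors has 
the form 
\[
\epsilon_{ijk}\epsilon_{lmn} = \begin{vmatrix} \delta_{il} & \delta_{im} & \delta_{in} \\ \delta_{jl} & \delta_{jm} & \delta_{jn} \\ \delta_{kl} & \delta_{km} & \delta_{kn} \end{vmatrix}.
\]
Prove the results 
\[
\epsilon_{ijk}\epsilon_{imn} = \begin{vmatrix} \delta_{jm} & \delta_{jn} \\ \delta_{km} & \delta_{kn} \end{vmatrix}, \qquad \epsilon_{ijk}\epsilon_{ijn} = 2\delta_{kn}, \qquad \epsilon_{ijk}\epsilon_{ijk} = 3!
\]

(c) Extend these results to the case of the four-dimensional (Lorentz) 
tensor $\epsilon_{\mu\nu\alpha\beta}$ (remember that a minus sign will appear as a result of 
$\epsilon_{0123} = +1$ but $\epsilon^{0123} = -1$). 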


20.7 Starting from the amplitude for the process 

\[
\nu_\mu + e^- \to \mu^- + \nu_e
\]

given by the current-current theory of weak interactions, 

\[
M = -i(G_F/\sqrt{2})\,\bar u(\mu)\gamma_\mu(1 - \gamma_5)u(\nu_\mu)\,g^{\mu\nu}\,\bar u(\nu_e)\gamma_\nu(1 - \gamma_5)u(e),
\]

verify the intermediate results given in section 20.5 leading to the result 

\[
d\sigma/dt = G_F^2/\pi
\]

(neglecting all lepton masses). Hence show that the local total cross section 
for this process rises linearly with s: 

\[
\sigma = G_F^2 s/\pi.
\]


20.8 The invariant amplitude for $\pi^+ \to e^+\nu$ decay may be written as (see 
(18.52)) 

\[
M = (G_F V_{ud})\, f_\pi\, p^\mu\, \bar u(\nu)\gamma_\mu(1 - \gamma_5)v(e)
\]

where $p^\mu$ is the 4-momentum of the pion, and the neutrino is taken to be 
massless. Evaluate the decay rate in the rest frame of the pion using the 
decay rate formula 

\[
\Gamma = (1/2m_\pi)\,|M|^2\, {\rm dLips}(m_\pi^2; k_e, k_\nu).
\]

Show that the ratio of $\pi^+ \to e^+\nu$ and $\pi^+ \to \mu^+\nu$ rates is given by 

\[
\frac{\Gamma(\pi^+ \to e^+\nu)}{\Gamma(\pi^+ \to \mu^+\nu)} = \left(\frac{m_e}{m_\mu}\right)^2\left(\frac{m_\pi^2 - m_e^2}{m_\pi^2 - m_\mu^2}\right)^2.
\]

Repeat the calculation using the amplitude 

\[
M' = (G_F V_{ud})\, f_\pi\, p^\mu\, \bar u(\nu)\gamma_\mu(g_V + g_A\gamma_5)v(e)
\]

and retaining a finite neutrino mass. Discuss the $e^+/\mu^+$ ratio in the light of 
your result. 


20.9 

(a) Verify that the inclusive inelastic neutrino-proton scattering differential 
cross section has the form 
\[
\frac{d^2\sigma^{(\nu)}}{dQ^2\,d\nu} = \frac{G_F^2}{2\pi}\frac{k'}{k}\left(W_2^{(\nu)}\cos^2(\theta/2) + W_1^{(\nu)}\,2\sin^2(\theta/2) + \frac{k + k'}{M}\sin^2(\theta/2)\,W_3^{(\nu)}\right)
\]
in the notation of section 20.7.2. 

(b) Using the Bjorken scaling behaviour 
\[
\nu W_2^{(\nu)} \to F_2^{(\nu)}, \qquad M W_1^{(\nu)} \to F_1^{(\nu)}, \qquad \nu W_3^{(\nu)} \to F_3^{(\nu)}
\]
rewrite this expression in terms of the scaling functions. In terms of 
the variables x and y, neglect all masses and show that 
\[
\frac{d^2\sigma^{(\nu)}}{dx\,dy} = \frac{G_F^2}{2\pi}\,s\,[F_2^{(\nu)}(1 - y) + F_1^{(\nu)}xy^2 + F_3^{(\nu)}(1 - y/2)yx].
\]
Remember that 
\[
\frac{k'\sin^2(\theta/2)}{M} = \frac{xy}{2}.
\]

(c) Insert the Callan-Gross relation 
\[
2xF_1^{(\nu)} = F_2^{(\nu)}
\]
to derive the result quoted in section 20.7.2: 
\[
\frac{d^2\sigma^{(\nu)}}{dx\,dy} = \frac{G_F^2}{2\pi}\,s\,F_2^{(\nu)}\left(\frac{1 + (1 - y)^2}{2} + \frac{xF_3^{(\nu)}}{F_2^{(\nu)}}\,\frac{1 - (1 - y)^2}{2}\right).
\]


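20.10 The differential cross section for $\nu_\mu q$ scattering by charged currents 
has the same form (neglecting masses) as the $\nu_\mu e^- \to \mu^-\nu_e$ result of problem 
20.7, namely 

\[
\frac{d\sigma}{dt}(\nu q) = \frac{G_F^2}{\pi}.
\]

(a) Show that the cross section for scattering by antiquarks $\nu_\mu\bar q$ has the 
form 
\[
\frac{d\sigma}{dt}(\nu\bar q) = \frac{G_F^2}{\pi}(1 - y)^2.
\]

(b) Hence prove the results quoted in section 20.7.2: 
\[
\frac{d^2\sigma}{dx\,dy}(\nu q) = \frac{G_F^2}{\pi}\,s\,x\,\delta(x - Q^2/2M\nu)
\]
and 
\[
\frac{d^2\sigma}{dx\,dy}(\nu\bar q) = \frac{G_F^2}{\pi}\,s\,x\,(1 - y)^2\,\delta(x - Q^2/2M\nu)
\]
(where M is the nucleon mass). 

(c) Use the parton model prediction 
\[
\frac{d^2\sigma^{(\nu)}}{dx\,dy} = \frac{G_F^2}{\pi}\,s\,x\,[q(x) + \bar q(x)(1 - y)^2]
\]
to show that 
\[
F_2^{(\nu)} = 2x[q(x) + \bar q(x)]
\]
and 
\[
\frac{xF_3^{(\nu)}(x)}{F_2^{(\nu)}(x)} = \frac{q(x) - \bar q(x)}{q(x) + \bar q(x)}.
\]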


20.11 Verify the transformation laws (20.160). 


CP Violation and Oscillation Phenomena 


In this chapter we shall continue with the phenomenology of weak interactions, 
introducing two topics which have been the focus of intense experimental effort 
in the recent decade: CP violation in B meson decays, and oscillations in both 
neutral meson and neutrino systems. In the following chapter we take up again 
the gauge theory theme, with the Glashow-Salam-Weinberg electroweak theory. 

CP violation was first discovered in the decays of neutral K mesons (Chris- 
tenson et al. 1964), but we shall not follow a historical approach to this sub- 
ject. Instead we shall concentrate on B-meson decays, where the effects are 
far larger, and much clearer to interpret theoretically than in the K-meson 
case. CP violation is reviewed in Branco et al. (1999), Bigi and Sanda (2000) 
and Harrison and Quinn (1998). We aim simply to illustrate the principles 
with some particular examples. In particular, we shall generally not discuss 
theoretical predictions; our main emphasis will be on describing selected 
experiments which have allowed determinations of the angles $\alpha$, $\beta$ and $\gamma$ of the 
unitarity triangle, figure 20.10. 

We saw in section 20.7.3 that, in the Standard Model, CP violation is 
attributable solely to one irreducible phase degree of freedom, $\delta$, in the CKM 
matrix V. Clearly, to measure this phase, it is necessary (as usual in quantum 
mechanics) to create situations where it enters into the interference between 
two complex amplitudes. Two situations may be distinguished (Carter and 
Sanda 1980): 


(i) interference between two decay amplitudes $B^0 \to X$ and $\bar B^0 \to X$, 
where the $B^0$ and $\bar B^0$ have been produced in a coherent state by mixing, 
and decay to a common hadronic final state X; 


(ii) interference between two different amplitudes for a single B-meson to 
decay to a final state X. 


Method (ii) (‘direct CP violation’) can be applied to charged as well as neutral 
mesons. 

The mixing in method (i) is formally similar to that involved in neutrino 
oscillations, which we treat after the meson case. We shall therefore start in 
section 21.1 with an example illustrating method (ii). We set up the mixing 
formalism and apply it to CP violation in B decays in section 21.2; we discuss 
K decays in section 21.3. Neutrino oscillations are treated in section 21.4. 

FIGURE 21.1 
Tree diagram contribution to $B^0 \to K^+\pi^-$ via the quark transition $\bar b \to \bar s u\bar u$. 


FIGURE 21.2 
Penguin diagrams ($f = u, c, t$) contributing to $B^0 \to K^+\pi^-$ via the quark 
transition $\bar b \to \bar s u\bar u$. 


21.1 Direct CP violation in B decays 


Consider the decays 

\[
B^0 \to K^+\pi^- \quad {\rm and} \quad \bar B^0 \to K^-\pi^+. \tag{21.1}
\]

The first of these can proceed via the quark transitions shown in figure 21.1, 
which (in parton-like language) is a ‘tree-diagram’. Of course, long-distance 
strong interaction effects will come into play in forming the hadronic states 
$B^0$, $K^+$ and $\pi^-$, and in final state interactions between the $K^+$ and $\pi^-$; we 
do not represent these strong interactions in figure 21.1, or in subsequent 
similar diagrams. We are specifically interested in the weak phase of figure 
21.1, since it is this quantity which changes sign under the CP transformation 
($V_{ij} \to V^*_{ij}$), and this phase change will lead to observable CP violation effects. 
By contrast, the strong interaction phases, which will play an important role, 
will be CP invariant, but we do not need to display them yet. So we write 
the amplitude for figure 21.1 as 

\[
A_T(B^0 \to K^+\pi^-) = V^*_{ub}V_{us}\,t_u, \tag{21.2}
\]

where the CKM couplings have been displayed. 
There are, however, three order-$\alpha_s$ loop corrections to figure 21.1, shown 
in figure 21.2, where f = u, c and t. We write the amplitude for the sum of 
these three ‘penguin’ diagrams as 

\[
A_P(B^0 \to K^+\pi^-) = V^*_{ub}V_{us}\,p_u + V^*_{cb}V_{cs}\,p_c + V^*_{tb}V_{ts}\,p_t, \tag{21.3}
\]


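where $p_f$ is the penguin amplitude with f in the loop. It is convenient to use 
a unitarity relation to rewrite $V^*_{tb}V_{ts}$ in terms of the other two related CKM 
products: 

\[
V^*_{tb}V_{ts} = -V^*_{ub}V_{us} - V^*_{cb}V_{cs} \tag{21.4}
\]

so that the total amplitude becomes 

\[
A(B^0 \to K^+\pi^-) = V^*_{ub}V_{us}\,T_{K\pi} + V^*_{cb}V_{cs}\,P_{K\pi}, \tag{21.5}
\]

where 

\[
T_{K\pi} = t_u + p_u - p_t, \qquad P_{K\pi} = p_c - p_t. \tag{21.6}
\]

In terms of the parametrization (20.179), (21.5) becomes 

\[
A(B^0 \to K^+\pi^-) = A\lambda^4(\rho + i\eta)\,T_{K\pi} + A\lambda^2(1 - \lambda^2/2)\,P_{K\pi}. \tag{21.7}
\]

Similarly, the amplitude for the charge-conjugate reaction is 

\[
A(\bar B^0 \to K^-\pi^+) = A\lambda^4(\rho - i\eta)\,T_{K\pi} + A\lambda^2(1 - \lambda^2/2)\,P_{K\pi}. \tag{21.8}
\]

We can now calculate the decay-rate asymmetry 

\[
A_{K\pi} = \frac{|A(\bar B^0 \to K^-\pi^+)|^2 - |A(B^0 \to K^+\pi^-)|^2}{|A(\bar B^0 \to K^-\pi^+)|^2 + |A(B^0 \to K^+\pi^-)|^2}. \tag{21.9}
\]

To simplify things, let us take a common complex factor K out of the expressions 
(21.7) and (21.8) and write them as 

\[
\begin{aligned}
A(B^0 \to K^+\pi^-) &= K\,(e^{i\gamma} + R\,e^{i(\delta_P - \delta_T)}) && (21.10)\\
A(\bar B^0 \to K^-\pi^+) &= K\,(e^{-i\gamma} + R\,e^{i(\delta_P - \delta_T)}), && (21.11)
\end{aligned}
\]

where (see equation (20.183)) $\gamma$ is the phase of $\rho + i\eta$, R is real, and $\delta_P - \delta_T$ is 
the difference in (strong) phases between $P_{K\pi}$ and $T_{K\pi}$. Then we easily find 

\[
A_{K\pi} = \frac{2R\sin\gamma\,\sin(\delta_T - \delta_P)}{1 + R^2 + 2R\cos\gamma\,\cos(\delta_T - \delta_P)}. \tag{21.12}
\]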
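
The step from (21.10) and (21.11) to (21.12) can be checked numerically. The sketch below evaluates the asymmetry (21.9) directly from the two amplitudes (with the common factor K dropped, since it cancels in the ratio) and compares it with the closed form. The parameter values are arbitrary test inputs, not measured quantities, and the function names are illustrative.

```python
import cmath
import math


def amplitudes(gamma, R, delta_T, delta_P):
    """Amplitudes (21.10) and (21.11) with the common factor K removed."""
    strong = R * cmath.exp(1j * (delta_P - delta_T))
    a_b0 = cmath.exp(1j * gamma) + strong       # B0 -> K+ pi-
    a_b0bar = cmath.exp(-1j * gamma) + strong   # B0bar -> K- pi+
    return a_b0, a_b0bar


def asymmetry_from_amplitudes(gamma, R, delta_T, delta_P):
    """Decay-rate asymmetry defined in (21.9)."""
    a, abar = amplitudes(gamma, R, delta_T, delta_P)
    return (abs(abar) ** 2 - abs(a) ** 2) / (abs(abar) ** 2 + abs(a) ** 2)


def asymmetry_closed_form(gamma, R, delta_T, delta_P):
    """Closed form (21.12)."""
    d = delta_T - delta_P
    return (2 * R * math.sin(gamma) * math.sin(d)
            / (1 + R * R + 2 * R * math.cos(gamma) * math.cos(d)))


# Arbitrary test inputs (radians); not fitted values.
args = (1.2, 1.0, 0.0, 0.3)
print(asymmetry_from_amplitudes(*args), asymmetry_closed_form(*args))
```

Setting either the weak phase or the strong phase difference to zero makes both functions return zero, which is the numerical counterpart of the requirement that both kinds of phase must differ between the two interfering amplitudes.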


Thus we see that, for a CP-violating signal, there must be two interfering 
amplitudes leading to a common final state, and the amplitudes must have 
both different weak phases and different strong phases. An order of magnitude 
estimate of the effect can be made as follows. First, note that $P_{K\pi}$ is not 
ultraviolet divergent, since it is the difference of two penguin contributions; 
its magnitude is expected to be of order $\alpha_s/\pi \sim 0.05$. The tree contribution 
in (21.7) carries an extra factor of $\lambda^2 \sim 0.05$ as compared with the penguin 
contribution, so that R is of order 1. This indicates that the asymmetry 
should be significant. 

FIGURE 21.3 

Left-hand part: tree diagram contributions to $B^- \to D^0K^-$ (upper diagram, 
via quark transition $b \to c\bar u s$), and to $B^- \to \bar D^0K^-$ (lower diagram, via quark 
transition $b \to u\bar c s$). Right-hand part: decays of $D^0$ and $\bar D^0$ to the common 
$K_S\pi^+\pi^-$ state. 


Indeed non-zero values of $A_{K\pi}$ have been reported by both the BaBar and 
Belle collaborations: 

\[
\begin{aligned}
\text{BaBar (Aubert et al. 2004)}: \quad A_{K\pi} &= -0.133 \pm 0.030 \pm 0.009 && (21.13)\\
\text{Belle (Chao et al. 2005)}: \quad A_{K\pi} &= -0.113 \pm 0.022 \pm 0.008 && (21.14)
\end{aligned}
\]


where the first error is statistical and the second is systematic. 

Although $A_{K\pi}$ is sensitive to the CP-violating angle $\gamma$, it is not easy to 
extract $\gamma$ cleanly from these measurements. Both the tree and the penguin 
amplitudes involve non-perturbative factors for producing a particular meson 
state from the corresponding $q\bar q$ state; the strong phases are also not calculable. 

A decay with no penguin contributions, but still with two interfering
channels, would have fewer uncertainties. (It is also less likely to be affected by new
physics, which could provide short-distance corrections to penguin loops.) One
such example is provided by the decays (i) $B^- \to D^0K^-$ and (ii) $B^- \to \bar D^0K^-$,
which can interfere when the $(D^0K^-)$ and $(\bar D^0K^-)$ states decay to a common
final state. Here the quark transition in (i) is $b \to c\bar u s$, and in (ii) is $b \to u\bar c s$;
in neither case is a penguin contribution possible.

The tree-level diagrams which contribute are shown in the left-hand parts
of figure 21.3 (we shall discuss the right-hand parts in a moment). We denote
the amplitude for $B^- \to D^0K^-$ by $A_B$, and note that $A_B \sim A\lambda^3$. The
amplitude for $B^- \to \bar D^0K^-$, $\bar A_B$, differs in three ways from $A_B$: (i) it is
colour-suppressed by a factor 1/3 since the $\bar c$ and $u$ have to be colour matched; (ii)
it contains the factor $V_{ub}V_{cs}^* \sim A\lambda^3(\rho - i\eta)$; (iii) it will have a different strong
interaction phase. With these factors in mind, we write

$$\bar A_B = r_B A_B\, e^{i(\delta_B-\gamma)} \tag{21.15}$$

where $\delta_B$ is the difference in strong phases between $\bar A_B$ and $A_B$, and $r_B$ is
the magnitude ratio of the amplitudes. Since $|\rho - i\eta| \sim 0.38$, $r_B$ is of order
0.1-0.2, allowing for the colour suppression.

Once again, the asymmetry is proportional to

$$|1 + r_B\exp[i(\delta_B-\gamma)]|^2 - |1 + r_B\exp[i(\delta_B+\gamma)]|^2 = 4r_B\sin\delta_B\sin\gamma. \tag{21.16}$$


This involves $\gamma$, but the relative smallness of $r_B$ tends to reduce the sensitivity
to $\gamma$. An alternative determination of $\gamma$ can be made (Attwood et al. 2001,
Giri et al. 2003) by making use of three-body decays (to a common channel)
of $D^0$ and $\bar D^0$, such as $D^0, \bar D^0 \to K_S\pi^+\pi^-$. If we denote the amplitude for
$D^0 \to K_S\pi^+\pi^-$ by $A(s_-,s_+)$ (see figure 21.3), where $s_- = (p_K + p_{\pi^-})^2$ and
$s_+ = (p_K + p_{\pi^+})^2$ are the indicated invariant masses, then the amplitude for
the $B^-$ to decay to $K^-K_S\pi^+\pi^-$ via the $D^0$ path is$^1$

$$A_- = A_B D\left[A(s_-,s_+) + r_B e^{i(\delta_B-\gamma)}A(s_+,s_-)\right], \tag{21.17}$$

and the amplitude for the charge conjugate reaction $B^+ \to K^+K_S\pi^+\pi^-$ is

$$A_+ = A_B D\left[A(s_+,s_-) + r_B e^{i(\delta_B+\gamma)}A(s_-,s_+)\right], \tag{21.18}$$


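where $D$ is the $D$ meson propagator. The event rate for the $B^-$ decay is then
$\Gamma_-(s_-,s_+)$ where (Aubert et al. 2008)

$$\Gamma_-(s_-,s_+) \propto |A(s_-,s_+)|^2 + r_B^2|A(s_+,s_-)|^2 + 2\left[x_-{\rm Re}\{A(s_-,s_+)A^*(s_+,s_-)\} + y_-{\rm Im}\{A(s_-,s_+)A^*(s_+,s_-)\}\right] \tag{21.19}$$

and the rate for $B^+$ decay is $\Gamma_+(s_-,s_+)$ where

$$\Gamma_+(s_-,s_+) \propto |A(s_+,s_-)|^2 + r_B^2|A(s_-,s_+)|^2 + 2\left[x_+{\rm Re}\{A(s_+,s_-)A^*(s_-,s_+)\} + y_+{\rm Im}\{A(s_+,s_-)A^*(s_-,s_+)\}\right]. \tag{21.20}$$

Here

$$x_- = r_B\cos(\delta_B-\gamma), \qquad y_- = r_B\sin(\delta_B-\gamma) \tag{21.21}$$

$$x_+ = r_B\cos(\delta_B+\gamma), \qquad y_+ = r_B\sin(\delta_B+\gamma). \tag{21.22}$$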


$^1$We are neglecting $D^0$-$\bar D^0$ mixing and CP asymmetries in $D$ decays, which are at the
1% or less level (Grossman et al. 2005).


FIGURE 21.4
Geometry of the CP-violating parameters $x_\pm$, $y_\pm$.


The geometry of the CP-violating parameters is shown in figure 21.4. Note
that the separation of the $B^-$ and $B^+$ positions in the $(x,y)$ plane is equal to
$2r_B|\sin\gamma|$, and is a measure of direct CP violation. The angle between the
lines connecting the $B^-$ and $B^+$ centres to the origin (0,0) is equal to $2\gamma$.
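The two geometrical statements just made can be checked numerically from the definitions (21.21) and (21.22). The following Python sketch does this; the function names and the sample values of $r_B$, $\delta_B$ and $\gamma$ are illustrative assumptions only.

```python
import math

def cartesian_cp_parameters(r_B, delta_B, gamma):
    """Return ((x_-, y_-), (x_+, y_+)) as defined in (21.21) and (21.22)."""
    minus = (r_B * math.cos(delta_B - gamma), r_B * math.sin(delta_B - gamma))
    plus = (r_B * math.cos(delta_B + gamma), r_B * math.sin(delta_B + gamma))
    return minus, plus

def separation(p, q):
    """Euclidean distance between two points of the (x, y) plane."""
    return math.hypot(p[0] - q[0], p[1] - q[1])

def opening_angle(p, q):
    """Angle at the origin between the lines to p and q, in [0, pi]."""
    dot = p[0] * q[0] + p[1] * q[1]
    cross = p[0] * q[1] - p[1] * q[0]
    return abs(math.atan2(cross, dot))

if __name__ == "__main__":
    # Assumed illustrative inputs, not measured values.
    r_B, delta_B, gamma = 0.1, math.radians(110.0), math.radians(70.0)
    minus, plus = cartesian_cp_parameters(r_B, delta_B, gamma)
    print("separation       :", separation(minus, plus))
    print("2 r_B |sin gamma|:", 2.0 * r_B * abs(math.sin(gamma)))
    print("opening angle/deg:", math.degrees(opening_angle(minus, plus)))
```

Both points lie on a circle of radius $r_B$, so the chord between them is fixed by the angle $2\gamma$ subtended at the origin.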

If the functional dependence of both the modulus and the phase of $A(s_-,s_+)$
were known, then the rates would depend on only three variables, $r_B$, $\delta_B$, and
$\gamma$ (or equivalently on $x_\pm, y_\pm$). In fact, $A(s_-,s_+)$ can be determined from a
Dalitz plot analysis of the decays of $D^0$ mesons coming from $D^{*+} \to D^0\pi^+$
decays produced in $e^+e^- \to c\bar c$ events; the charge of the low-momentum $\pi^+$
identifies the flavour of the $D^0$. Such an analysis is a well-established tool
in the study of three-hadron final states, originating in the pioneer work of
Dalitz (1953), in connection with the decay $K \to 3\pi$. The partial rate for
$D^0(\bar D^0) \to K_S\pi^+\pi^-$ is (see the kinematics section of Nakamura et al. 2010)

$$d\Gamma \propto |A(s_-,s_+)|^2\,ds_-\,ds_+. \tag{21.23}$$


The physical region in the $s_-, s_+$ plane is a bounded oval-like region, which
would be uniformly populated if $A(s_-,s_+)$ were a constant. In reality, the
decay is dominated by quasi two-body states, in particular

$$D^0 \to K^{*-}(s_-)\,\pi^+ \ \ ({\rm CA}), \qquad D^0 \to K^{*+}(s_+)\,\pi^- \ \ ({\rm DCS}), \qquad D^0 \to K_S\,\rho^0(s_0) \ \ ({\rm CP}), \tag{21.24}$$

where (CA) means CKM-favoured, (DCS) means doubly CKM-suppressed,
and (CP) means that it is a CP eigenstate. The Dalitz plot shows a dense
band of events at $s_- = m_{K^*}^2$ corresponding to the $K^{*-}$ resonance, a band
at $s_+ = m_{K^*}^2$, and a band at $s_0 = m_\rho^2$, where $s_0 = (p_{\pi^+} + p_{\pi^-})^2$ and
$s_+ + s_- + s_0 = m_D^2 + m_K^2 + 2m_\pi^2$.

The Dalitz-plot analysis proceeds by writing (Aubert et al. 2008) $A(s_-,s_+)$
as a coherent sum of terms representing the quasi two-body modes, together
with a non-resonant background. Once $A(s_-,s_+)$ is known, it is inserted into
$\Gamma_\mp(s_-,s_+)$ to obtain $(x_\mp, y_\mp)$ from the Dalitz plot distributions of the signal
modes of the $B^\mp$ decays. From these, the quantities $r_B$, $\delta_B$ and finally $\gamma$ can
be inferred.

This method has been applied by both BaBar and Belle to determine $\gamma$.
Their most recently published results are


BaBar (Aubert et al. 2010): $\gamma = (68 \pm 14 \pm 4 \pm 3)^\circ$   (21.25)
Belle (Poluektov et al. 2010): $\gamma = (78.4^{+10.8}_{-11.6} \pm 3.6 \pm 8.9)^\circ$   (21.26)


where the last uncertainty is due to the D-decay modelling (which ignores,
for example, rescattering among the three final state particles). Both these
experiments use decays $B^\pm \to DK^\pm$, $B^\pm \to D^*K^\pm$ with $D^* \to D\pi^0$ and
$D^* \to D\gamma$; BaBar in addition uses the decays $D^0 \to K_SK^+K^-$.

We now turn to the other main method of detecting CP violation, through
the interference between decays of (for example) $B^0$ and $\bar B^0$ mesons that have
been produced in a coherent state by mixing. For this we need to set up the
formalism describing time-dependent mixing.


21.2 CP violation in B meson oscillations 


$B^0$-$\bar B^0$ oscillations have been studied by the BaBar and Belle collaborations at
the PEP2 and KEKB asymmetric $e^+e^-$ colliders. These machines operate at
a centre of mass energy equal to the mass of the $\Upsilon(4S)$ resonance state, which
is some 20 MeV above the threshold for $B^0\bar B^0$ production. If produced in a
symmetric $e^+e^-$ collider (with equal and opposite momenta for the $e^+$ and
$e^-$), the produced B mesons would move very slowly, $v/c \sim 0.06$, covering a
distance of only some 30 µm before decaying ($c\tau$ for B mesons is about 460
µm). This would make it impossible to resolve the decay vertices of the two
Bs, as is required in order to observe $B^0$-$\bar B^0$ oscillations, since the accuracy of
the decay vertex reconstruction is roughly 100 µm. Oddone (1989) suggested
making $e^+e^-$ collisions with asymmetric energy colliding beams, so that the
B mesons now move with the motion of the centre of mass, which can be
considerable. For example, at PEP2 ($e^-$ 9 GeV, $e^+$ 3.1 GeV) $\beta_{\rm cm} \sim 0.5$
and $\gamma_{\rm cm} \sim 1.15$, so that the distance travelled in the (asymmetric) lab frame
during the lifetime of an average B meson is $\sim 250$ µm, which is measurable.
At KEKB ($e^-$ 8 GeV, $e^+$ 3.5 GeV), $\beta_{\rm cm}\gamma_{\rm cm} \sim 0.425$.
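The boost figures quoted above follow from elementary kinematics for two head-on beams. The Python sketch below reproduces them from the beam energies given in the text; the function name is an illustrative choice, and the electron mass is neglected (an assumption that is excellent at these energies).

```python
import math

C_TAU_B_UM = 460.0  # c*tau for B mesons in micrometres, as quoted in the text

def cm_boost(e_high, e_low):
    """Return (beta, gamma, beta*gamma) of the centre of mass for head-on
    beams of energies e_high and e_low (GeV), neglecting the electron mass."""
    sqrt_s = 2.0 * math.sqrt(e_high * e_low)   # invariant mass of the collision
    beta = (e_high - e_low) / (e_high + e_low)  # total momentum / total energy
    gamma = (e_high + e_low) / sqrt_s           # total energy / invariant mass
    return beta, gamma, beta * gamma

if __name__ == "__main__":
    for name, e_high, e_low in (("PEP2", 9.0, 3.1), ("KEKB", 8.0, 3.5)):
        beta, gamma, beta_gamma = cm_boost(e_high, e_low)
        flight = beta_gamma * C_TAU_B_UM
        print(f"{name}: beta = {beta:.3f}, gamma = {gamma:.3f}, "
              f"beta*gamma = {beta_gamma:.3f}, mean flight = {flight:.0f} um")
```

In both cases the invariant mass $2\sqrt{E_{\rm high}E_{\rm low}}$ comes out close to 10.58 GeV, the $\Upsilon(4S)$ mass.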

Since the $\Upsilon(4S)$ state has $J = 1$, the decay $\Upsilon \to B\bar B$ leaves the B mesons in
a p wave state, which is forbidden for two identical spinless bosons; therefore
one must be a $B^0$ and the other a $\bar B^0$, but we do not know which is which until
one has been identified (‘tagged’) in some way. The flavour of the tagged B
may be determined, for example, by the charge of the lepton emitted in the
semi-leptonic decays $B^0 \to D^-\ell^+\nu_\ell$, $\bar B^0 \to D^+\ell^-\bar\nu_\ell$. We shall not describe the
evolution of the $B\bar B$ coherent state following production; interested readers
may consult Cohen et al. (2009) for an instructive discussion, which also
covers neutrino oscillations. We shall be interested in the time dependence of
the state of the meson which partners the tagged meson, once the correlated
state has been collapsed by the tagging at time $t = 0$ say; the partner meson
will be reconstructed by its decay products. Note that the partner meson
can decay earlier or later than the tagged one; its state vector has that time
dependence which ensures that it becomes the correlate of the tagged particle
at $t = 0$.


21.2.1 Time-dependent mixing formalism 


We denote the neutral meson by $B$ (which will usually be $B^0$, but could also be
$K^0$ or $D^0$), and its CP-conjugate by $\bar B$. According to the theory of Weisskopf
and Wigner (1930a, 1930b) (see also appendix A of Kabir 1968) a state that
is initially in some superposition of $|B\rangle$ and $|\bar B\rangle$, say

$$|\psi(0)\rangle = a(0)|B\rangle + \bar a(0)|\bar B\rangle, \tag{21.27}$$

will evolve in time to a general superposition

$$|\psi(t)\rangle = a(t)|B\rangle + \bar a(t)|\bar B\rangle \tag{21.28}$$

governed by an effective Hamiltonian $H$ with matrix elements, in the 2-state
subspace,

$$H = M - \frac{i}{2}\Gamma = \begin{pmatrix} A & p^2 \\ q^2 & A \end{pmatrix} \tag{21.29}$$

where $M$ and $\Gamma$ are Hermitian, and the equality $M_{11} - i\Gamma_{11}/2 = M_{22} - i\Gamma_{22}/2 \equiv A$
follows from CPT invariance, which we shall assume. If CP is a good
symmetry, then

$$\langle B|H|\bar B\rangle = \langle B|(CP)^{-1}(CP)H(CP)^{-1}(CP)|\bar B\rangle = \langle \bar B|H|B\rangle \tag{21.30}$$

so that $p$ would equal $q$. Since $M$ and $\Gamma$ are both Hermitian, this would imply
that $M_{12}$ and $\Gamma_{12}$ are both real; in the CP non-invariant world, this is not
the case.

The eigenvalues of $H$ are

$$\omega_L = m_L - i\Gamma_L/2 = A + pq, \qquad \omega_H = m_H - i\Gamma_H/2 = A - pq, \tag{21.31}$$

and the corresponding eigenstates are

$$|B_L\rangle = \frac{p|B\rangle + q|\bar B\rangle}{(|p|^2+|q|^2)^{1/2}}, \qquad |B_H\rangle = \frac{p|B\rangle - q|\bar B\rangle}{(|p|^2+|q|^2)^{1/2}}. \tag{21.32}$$

The states $|B_L\rangle$, $|B_H\rangle$ have definite masses $m_L$, $m_H$ and widths $\Gamma_L$ and $\Gamma_H$. The
widths $\Gamma_L$, $\Gamma_H$ are equal to a very good approximation for B and D mesons,
because the Q-values of both are large; in the case of K-mesons (see section
21.3), one state decays predominantly to $2\pi$ and the other to $3\pi$, with different
Q-values, and the lifetimes are very different.
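The eigenvalues (21.31) and eigenvectors (21.32) of the matrix (21.29) can be verified by direct multiplication. The Python sketch below does so for arbitrary complex entries; the function name and the numerical values of $A$, $p$ and $q$ are illustrative assumptions with no physical significance.

```python
import cmath

def check_eigenpair(A, p, q, sign):
    """Test the trial eigenvector (p, sign*q) of H = [[A, p**2], [q**2, A]].

    Returns (omega, residual), where omega = A + sign*p*q is the eigenvalue
    expected from (21.31) and residual is the largest component of
    |H v - omega v|, which should vanish up to rounding.
    """
    omega = A + sign * p * q
    v0, v1 = p, sign * q              # unnormalized eigenvector of (21.32)
    hv0 = A * v0 + p**2 * v1          # first row of H acting on v
    hv1 = q**2 * v0 + A * v1          # second row of H acting on v
    residual = max(abs(hv0 - omega * v0), abs(hv1 - omega * v1))
    return omega, residual

if __name__ == "__main__":
    # Arbitrary illustrative complex numbers.
    A = 5.0 - 0.5j
    p = 0.7 * cmath.exp(0.3j)
    q = 0.6 * cmath.exp(-0.4j)
    for label, sign in (("L", +1), ("H", -1)):
        omega, residual = check_eigenpair(A, p, q, sign)
        print(f"omega_{label} = {omega:.6f}, residual = {residual:.2e}")
```

The mass and width of each eigenstate are then read off as $m = {\rm Re}\,\omega$ and $\Gamma = -2\,{\rm Im}\,\omega$.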


Suppose now that at time $t = 0$ the ‘tag’ shows that a $B^0$ has decayed.
Then the partner is a $\bar B^0$ at $t = 0$, described by the superposition

$$|\bar B^0\rangle = \frac{(|p|^2+|q|^2)^{1/2}}{2q}\left(|B_L\rangle - |B_H\rangle\right). \tag{21.33}$$

At a later time $t$ in the $\bar B^0$ rest-frame, this state evolves to (problem 21.1)

$$|\bar B^0(t)\rangle = g_+(t)|\bar B^0\rangle + (p/q)\,g_-(t)|B^0\rangle \tag{21.34}$$

where

$$g_+(t) = e^{-im_Bt}e^{-\Gamma t/2}\cos(\Delta m_B t/2) \tag{21.35}$$

$$g_-(t) = i\,e^{-im_Bt}e^{-\Gamma t/2}\sin(\Delta m_B t/2) \tag{21.36}$$

with $m_B = \frac{1}{2}(m_H + m_L)$ and $\Delta m_B = m_H - m_L$. Note, from (21.34), that the
state which started as a $\bar B^0$ at $t = 0$ develops also a $B^0$ component at a later
time. Similarly, if the tag shows that a $\bar B^0$ has decayed, the partner meson at
$t = 0$ is a $B^0$, and its state evolves to

$$|B^0(t)\rangle = (q/p)\,g_-(t)|\bar B^0\rangle + g_+(t)|B^0\rangle. \tag{21.37}$$

Consider first the semileptonic decays of $B^0$ and $\bar B^0$, where the only
transitions that can occur are

$$B^0 \to \ell^+\nu_\ell X, \qquad \bar B^0 \to \ell^-\bar\nu_\ell X. \tag{21.38}$$

The state $|\bar B^0(t)\rangle$ of (21.34), however, which was pure $\bar B^0$ at $t = 0$, will be able to
decay to a positively charged lepton via the admixture of the $|B^0\rangle$ component;
similarly negatively charged leptons may appear in the decay of $|B^0(t)\rangle$. From
(21.34) and (21.37) we obtain directly the amplitudes for these ‘wrong sign’
transitions:

$$\langle \ell^-\bar\nu_\ell X|H_{\rm sl}|B^0(t)\rangle = (q/p)\,g_-(t)\,\langle \ell^-\bar\nu_\ell X|H_{\rm sl}|\bar B^0\rangle \tag{21.39}$$

and

$$\langle \ell^+\nu_\ell X|H_{\rm sl}|\bar B^0(t)\rangle = (p/q)\,g_-(t)\,\langle \ell^+\nu_\ell X|H_{\rm sl}|B^0\rangle, \tag{21.40}$$

where $H_{\rm sl}$ is the relevant semileptonic part of the complete weak
current-current Hamiltonian. Hence the semileptonic asymmetry is

$$A_{\rm SL} = \frac{\Gamma(\bar B^0(t) \to \ell^+\nu_\ell X) - \Gamma(B^0(t) \to \ell^-\bar\nu_\ell X)}{\Gamma(\bar B^0(t) \to \ell^+\nu_\ell X) + \Gamma(B^0(t) \to \ell^-\bar\nu_\ell X)} = \frac{1-|q/p|^4}{1+|q/p|^4} \tag{21.41}$$

independent of time. In (21.41) we have used the fact that $\langle \ell^-\bar\nu_\ell X|H_{\rm sl}|\bar B^0\rangle =
\langle \ell^+\nu_\ell X|H_{\rm sl}|B^0\rangle^*$. The upper bound on $A_{\rm SL}$ is of order $10^{-3}$ (Nakamura et al.
2010). At the present level of experimental precision, it is a very good
approximation to take $|q/p| = 1$. Since $q/p = [(M_{12}^* - i\Gamma_{12}^*/2)/(M_{12} - i\Gamma_{12}/2)]^{1/2}$, it
follows that in this approximation we can neglect $\Gamma_{12}$, and the phase of $q/p$ is
just minus the phase of $M_{12}$.

FIGURE 21.5
Box diagram contributions to $B^0$-$\bar B^0$ mixing.


In the Standard Model, the $B^0$-$\bar B^0$ mixing amplitude occurs via the box
diagrams of figure 21.5. The box amplitude is approximately proportional
to the product of the masses of the internal quarks, and in this case the t
quark contribution dominates (the magnitudes of the CKM couplings are all
comparable). The phase of $M_{12}$ is then that of $(V_{td}^*V_{tb})^2$, which is the phase
of $((1 - \rho - i\eta)^*)^2$ in the parametrization of (20.179), which in turn is equal
to the angle $2\beta$. Hence

$$(q/p) = e^{-2i\beta}, \tag{21.42}$$

neglecting terms of order $\lambda^4$. Equation (21.42) will be important in what
follows.

From (21.34) we can now read off that the probability that the state $|\bar B^0(t)\rangle$
(which, we remind the reader, is the partner of the state tagged as a $B^0$ at
$t = 0$, and which is pure $\bar B^0$ at $t = 0$) decays as a $\bar B^0$ at $t \neq 0$ is $|g_+(t)|^2 =
\exp(-\Gamma t)\cos^2(\Delta m_B t/2)$. Similarly, the probability that this state decays as a
$B^0$ at time $t$ is $\exp(-\Gamma t)\sin^2(\Delta m_B t/2)$, taking $|p/q| = 1$. Hence the difference
in these probabilities, normalized to their sum, is $\cos\Delta m_B t$. Measurements
of this flavour asymmetry yield the value of $\Delta m_B$, currently (Nakamura et al.
2010)

$$\Delta m_B = (3.337 \pm 0.033) \times 10^{-10}\ {\rm MeV}. \tag{21.43}$$
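To connect (21.43) with a time scale, the mass difference can be converted to an angular frequency by dividing by $\hbar$. The Python sketch below performs the conversion and evaluates the flavour asymmetry $\cos\Delta m_B t$ at a few proper times. The central value is taken here as $3.337 \times 10^{-10}$ MeV (an assumed reading of (21.43)); the value of $\hbar$ and the function names are inputs introduced for this illustration.

```python
import math

HBAR_MEV_S = 6.582119569e-22   # reduced Planck constant in MeV s (assumed input)
DELTA_M_B_MEV = 3.337e-10      # central value of (21.43), as read here

def to_inverse_ps(delta_m_mev):
    """Convert a mass difference in MeV to an angular frequency in ps^-1."""
    return delta_m_mev / HBAR_MEV_S * 1.0e-12

def flavour_asymmetry(t_ps, delta_m_inv_ps):
    """Unmixed-minus-mixed probability, normalized to the sum: cos(dm * t)."""
    return math.cos(delta_m_inv_ps * t_ps)

if __name__ == "__main__":
    dm = to_inverse_ps(DELTA_M_B_MEV)
    print(f"Delta m_B = {dm:.4f} ps^-1")
    print(f"first zero of the asymmetry at t = {math.pi / (2.0 * dm):.2f} ps")
    for t_ps in (0.0, 1.5, 3.0, 6.0):
        print(f"t = {t_ps:4.1f} ps  ->  asymmetry = {flavour_asymmetry(t_ps, dm):+.3f}")
```

The oscillation period is therefore several times the mean B lifetime, which is why only the first part of one oscillation is visible before the sample has decayed.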

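More generally, we define decay amplitudes to final states $|f\rangle$ by

$$A_f = \langle f|H_{\rm wk}|B^0\rangle, \qquad \bar A_f = \langle f|H_{\rm wk}|\bar B^0\rangle \tag{21.44}$$

$$A_{\bar f} = \langle \bar f|H_{\rm wk}|B^0\rangle, \qquad \bar A_{\bar f} = \langle \bar f|H_{\rm wk}|\bar B^0\rangle, \tag{21.45}$$

where $CP|f\rangle = |\bar f\rangle$ and $H_{\rm wk}$ is the weak interaction Hamiltonian responsible
for the transition. We can now calculate the rates for $|\bar B^0(t)\rangle$ to go to $|f\rangle$, and
for $|B^0(t)\rangle$ to go to $|f\rangle$; up to a common normalization factor, which we omit,
these rates are (problem 21.2)

$$\Gamma(\bar B^0(t) \to f) = \frac{1}{2}e^{-\Gamma t}\left\{|\bar A_f|^2 + |(p/q)A_f|^2 + \left(|\bar A_f|^2 - |(p/q)A_f|^2\right)\cos\Delta m_B t + 2\,{\rm Im}\left(\bar A_f A_f^*\,\frac{q}{p}\right)\sin\Delta m_B t\right\}, \tag{21.46}$$

and

$$\Gamma(B^0(t) \to f) = \frac{1}{2}e^{-\Gamma t}\left\{|A_f|^2 + |(q/p)\bar A_f|^2 + \left(|A_f|^2 - |(q/p)\bar A_f|^2\right)\cos\Delta m_B t - 2\,{\rm Im}\left(A_f^*\,\frac{q}{p}\,\bar A_f\right)\sin\Delta m_B t\right\}, \tag{21.47}$$

where $|q/p| = 1$ has been used in the interference term of (21.46).
The rates to $|\bar f\rangle$ are obtained by the substitutions $A_f \to A_{\bar f}$, $\bar A_f \to \bar A_{\bar f}$.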

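We can now derive the basic formulae for the time-dependent CP
asymmetry of neutral B decays to a final state $f$ common to $B^0$ and $\bar B^0$ (problem
21.3):

$$A_f(t) = \frac{\Gamma(\bar B^0(t) \to f) - \Gamma(B^0(t) \to f)}{\Gamma(\bar B^0(t) \to f) + \Gamma(B^0(t) \to f)} = S_f\sin(\Delta m_B t) - C_f\cos(\Delta m_B t) \tag{21.48}$$

where

$$S_f = \frac{2\,{\rm Im}\lambda_f}{1+|\lambda_f|^2}, \qquad C_f = \frac{1-|\lambda_f|^2}{1+|\lambda_f|^2}, \qquad \lambda_f = \frac{q}{p}\,\frac{\bar A_f}{A_f}. \tag{21.49}$$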
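The coefficients in (21.48) and (21.49) are simple functions of the single complex number $\lambda_f$. The Python sketch below evaluates them; the function names and the sample values of $\lambda_f$ are illustrative assumptions. For a pure phase $\lambda_f = e^{i\phi}$ one finds $C_f = 0$ and $S_f = \sin\phi$, so that the asymmetry is a pure sine wave.

```python
import cmath
import math

def cp_coefficients(lam):
    """Return (S_f, C_f) of (21.49) for a complex lambda_f."""
    norm = 1.0 + abs(lam) ** 2
    s_f = 2.0 * lam.imag / norm
    c_f = (1.0 - abs(lam) ** 2) / norm
    return s_f, c_f

def time_dependent_asymmetry(t, delta_m, lam):
    """Evaluate (21.48): S_f sin(dm t) - C_f cos(dm t)."""
    s_f, c_f = cp_coefficients(lam)
    return s_f * math.sin(delta_m * t) - c_f * math.cos(delta_m * t)

if __name__ == "__main__":
    # Assumed illustrative inputs: a pure phase, and a case with |lambda| != 1.
    for lam in (cmath.exp(0.7j), 0.8 * cmath.exp(0.7j)):
        s_f, c_f = cp_coefficients(lam)
        print(f"lambda = {lam:.3f}:  S_f = {s_f:+.3f},  C_f = {c_f:+.3f},  "
              f"S^2 + C^2 = {s_f**2 + c_f**2:.3f}")
```

Since $S_f^2 + C_f^2 \le 1$ by construction, a measured pair outside the unit circle would signal an experimental or statistical problem rather than new physics.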

21.2.2 Determination of the angles $\alpha(\phi_2)$ and $\beta(\phi_1)$ of the unitarity triangle


A very large number of measurements have been made, constraining the
parameters of the CKM matrix, or equivalently the unitarity triangle of figure
20.10. We shall limit our discussion to those measurements which determine
the angles $\alpha(\phi_2)$ and $\beta(\phi_1)$ of the triangle.


(i) The angle $\beta(\phi_1)$

One of the cleanest examples is the decay

$$B^0 \to J/\psi + K^0_{S,L}. \tag{21.50}$$

The tree diagram is shown in figure 21.6(a), and the penguins in figure 21.6(b).
The tree diagram contributes CKM factors $V_{cb}^*V_{cs} = A\lambda^2(1-\lambda^2/2)$. The $f = u$
penguin has factors $V_{ub}^*V_{us} = A\lambda^4(\rho + i\eta)$ which is suppressed by two powers of
$\lambda$; it also carries a loop factor $\sim \alpha_s/\pi$, and it may therefore be safely neglected.
The other two penguins have the same weak phase as the tree diagram. Hence
to a good approximation we can write the amplitude as

$$A_{\psi K} = (V_{cb}^*V_{cs})\,T_{\psi K}. \tag{21.51}$$

There is one subtlety: to get the two final states from $B^0$ and $\bar B^0$ to interfere,
we need $K^0$-$\bar K^0$ mixing to produce the (very nearly) CP eigenstates $K_S$ (CP
= +1) and $K_L$ (CP = −1). (We shall discuss the $K^0$-$\bar K^0$ system briefly
in section 21.3.) This introduces a factor $(q/p)_K = (V_{cd}^*V_{cs}/V_{cd}V_{cs}^*)$, quite
analogously to (21.42), but its effect on $\lambda_{\psi K}$ is negligible. So, remembering
that the relative orbital angular momentum of the two final state particles is
$L = 1$, we have $\lambda_{\psi K_S} = -\exp(-2i\beta)$ and $S_f = \sin 2\beta$, while the $J/\psi K_L$ state
has CP = +1 and $S_f = -\sin 2\beta$. Hence $S_f$ measures $-\eta_f\sin 2\beta$, where $\eta_f$
is the CP eigenvalue of the $J/\psi K^0_{S,L}$ state: the sinusoidal oscillations in the
asymmetry $A_{\psi K}$ for the two modes S, L will have the same amplitude and
opposite phase.

FIGURE 21.6
Tree (a) and penguin (b) contributions to $B^0 \to J/\psi + K^0_{S,L}$, via the quark
transitions $b \to c\bar c s$.

Both BaBar and Belle have reported increasingly precise measurements
of $A_{\psi K}$ in these modes. The early results (Abashian et al. 2001, Aubert
et al. 2001) were the first direct measurements of one of the angles of the
unitarity triangle, offering a test of the consistency of the CKM mechanism
for CP violation. Later measurements have achieved accuracies of about ±5%.
The current world average for $\sin 2\beta$ is (see the review by Ceccucci et al. in
Nakamura et al. 2010)

$$\sin 2\beta = 0.673 \pm 0.023. \tag{21.52}$$

Figure 21.7 shows the asymmetry (before corrections for experimental
effects) for $\eta_f = -1$ and $\eta_f = +1$ candidates as measured by BaBar (Aubert
et al. 2007a); Belle has reported similar results. We should note that a
measurement of $\sin 2\beta$ still leaves ambiguities in $\beta$ (for example, $\beta \to \pi/2 - \beta$),
which can be resolved by other measurements (Ceccucci et al., in Nakamura
et al. 2010).
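The ambiguity just mentioned is easy to make explicit. The Python sketch below inverts the world average (21.52) for the two solutions related by $\beta \to \pi/2 - \beta$ and propagates the quoted error linearly; the function names and the linear error propagation are illustrative choices, not part of the original analysis.

```python
import math

SIN_2BETA = 0.673       # central value of (21.52)
SIN_2BETA_ERR = 0.023   # quoted uncertainty of (21.52)

def beta_solutions_deg(sin_2beta):
    """Return the two solutions for beta in (0, 90) degrees."""
    beta_1 = 0.5 * math.degrees(math.asin(sin_2beta))
    return beta_1, 90.0 - beta_1

def beta_error_deg(sin_2beta, err):
    """Linear error propagation: d(beta) = d(sin 2beta) / (2 |cos 2beta|)."""
    cos_2beta = math.sqrt(1.0 - sin_2beta ** 2)
    return math.degrees(err / (2.0 * cos_2beta))

if __name__ == "__main__":
    beta_1, beta_2 = beta_solutions_deg(SIN_2BETA)
    err = beta_error_deg(SIN_2BETA, SIN_2BETA_ERR)
    print(f"beta = {beta_1:.2f} deg or {beta_2:.2f} deg, each +/- {err:.2f} deg")
```

Both solutions give the same $\sin 2\beta$ but opposite signs of $\cos 2\beta$, which is why additional measurements are needed to choose between them.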


(ii) The angle $\alpha(\phi_2)$


The angle $\alpha$ is the phase between $V_{tb}^*V_{td}$ and $V_{ub}^*V_{ud}$. It can be measured in
decays dominated by the quark transition $b \to u\bar u d$. Consider, for example,
the decays $B^0 \to \pi^+\pi^-$, $\bar B^0 \to \pi^+\pi^-$. Figure 21.8 shows the tree graph (a)
and penguin (b) contributions to $B^0 \to \pi^+\pi^-$. Exposing the weak phases as
before, the amplitude is

$$A_{\pi^+\pi^-} = V_{ub}^*V_{ud}\,(t + p_u - p_c) + V_{tb}^*V_{td}\,(p_t - p_c) = V_{ub}^*V_{ud}\,T_{+-} + V_{tb}^*V_{td}\,P_{+-}. \tag{21.53}$$


FIGURE 21.7

(a) Number of $\eta_f = -1$ candidates in the signal region with a $B^0$ tag ($N_{B^0}$) and
with a $\bar B^0$ tag ($N_{\bar B^0}$), and (b) the measured asymmetry $(N_{B^0} - N_{\bar B^0})/(N_{B^0} +
N_{\bar B^0})$, as functions of $t$; (c) and (d) are the corresponding distributions for
the $\eta_f = +1$ candidates. Figure reprinted with permission from Aubert et al.
(BaBar Collaboration) Phys. Rev. Lett. 99 171803 (2007). Copyright 2007
by the American Physical Society. (See color plate IV.)


FIGURE 21.8
Tree graph (a) and penguin (b) contributions to $B^0 \to \pi^+\pi^-$, via quark
transitions $b \to d u\bar u$.

Suppose first that the penguin contributions could be neglected. Then the
asymmetry $A_{\pi^+\pi^-}$ would measure

$${\rm Im}\,\lambda_{\pi^+\pi^-} = {\rm Im}\left(\frac{q}{p}\,\frac{\bar A_{\pi^+\pi^-}}{A_{\pi^+\pi^-}}\right) = {\rm Im}\left(\frac{V_{tb}^*V_{td}}{V_{tb}V_{td}^*}\,\frac{V_{ub}V_{ud}^*}{V_{ub}^*V_{ud}}\right) = {\rm Im}\,e^{-2i(\beta+\gamma)} = \sin 2\alpha \tag{21.54}$$

where $\alpha$ is defined as $\pi - \beta - \gamma$. Unfortunately, this simple result is spoiled by
the penguin contributions, which there is no good reason to ignore. However,
Gronau and London (1990) showed how an isospin analysis could disentangle
the tree and penguin parts. The method involves the three amplitudes
$A_{+-}$, $A_{00}(B^0 \to \pi^0\pi^0)$, and $A_{+0}(B^+ \to \pi^+\pi^0)$.

First of all, note that Bose statistics for the $2\pi$ states requires them to have
only the symmetric isospin states $I = 0$ or 2, since the angular momentum is
zero. Next, the effective non-leptonic weak Hamiltonian $H_{\rm nl}$ acting in the tree
diagram transition contains both $\Delta I = 1/2$ and $\Delta I = 3/2$ pieces; combining
with the initial $I = 1/2$ of the B meson, the first piece will lead only to the
$I = 0$ final state, while the second can reach only the $I = 2$ final
state. However, since the gluon in the penguin diagrams carries no isospin,
these diagrams can only change the isospin by $\Delta I = 1/2$, which connects only
to the $I = 0$ final state. The conclusion is that the $I = 2$ final state is free of
penguins, and carries the pure tree phase.
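The isospin content of the Bose-symmetric two-pion states can be checked with the standard Clebsch-Gordan coefficients for combining two $I = 1$ states. The Python sketch below hard-codes those coefficients (standard values, entered by hand here) and computes the overlaps; the dictionary and function names are illustrative assumptions.

```python
import math

s = math.sqrt

# Basis for I3 = 0: (|pi+ pi->, |pi0 pi0>, |pi- pi+>).
# Standard Clebsch-Gordan coefficients for 1 x 1 -> I, with I3 = 0.
I3_ZERO = {
    2: (s(1 / 6), s(2 / 3), s(1 / 6)),
    1: (s(1 / 2), 0.0, -s(1 / 2)),
    0: (s(1 / 3), -s(1 / 3), s(1 / 3)),
}

# Basis for I3 = +1: (|pi+ pi0>, |pi0 pi+>).
I3_PLUS = {
    2: (s(1 / 2), s(1 / 2)),
    1: (s(1 / 2), -s(1 / 2)),
}

def overlap(a, b):
    """Real inner product of two coefficient tuples."""
    return sum(x * y for x, y in zip(a, b))

# Bose-symmetrized physical states.
CHARGED_PAIR = (s(1 / 2), 0.0, s(1 / 2))   # (|pi+ pi-> + |pi- pi+>)/sqrt(2)
NEUTRAL_PAIR = (0.0, 1.0, 0.0)             # |pi0 pi0>
MIXED_PAIR = (s(1 / 2), s(1 / 2))          # (|pi+ pi0> + |pi0 pi+>)/sqrt(2)

if __name__ == "__main__":
    for isospin, state in I3_ZERO.items():
        print(f"I = {isospin}: <I 0|pi+ pi-> = {overlap(state, CHARGED_PAIR):+.4f}, "
              f"<I 0|pi0 pi0> = {overlap(state, NEUTRAL_PAIR):+.4f}")
    for isospin, state in I3_PLUS.items():
        print(f"I = {isospin}: <I 1|pi+ pi0> = {overlap(state, MIXED_PAIR):+.4f}")
```

The $I = 1$ overlaps vanish for all three symmetric states, and the symmetric $\pi^+\pi^0$ state is pure $I = 2$, which is the property exploited in the isospin analysis.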

This information can be exploited as follows (Gronau and London 1990).
First, the action of $H_{\rm nl}$ on the $B^0$ state can be written as

$$H_{\rm nl}|B^0\rangle = \frac{1}{\sqrt2}A_{3/2}|2\,0\rangle + \frac{1}{\sqrt2}A_{1/2}|0\,0\rangle \tag{21.55}$$

where as usual $|I\,I_3\rangle$ is the state with isospin $I$ and third component $I_3$.
Expanding the states $\pi^+\pi^-$, $\pi^+\pi^0$ and $\pi^0\pi^0$ in terms of definite isospin states,
one finds (problem 21.4)

$$A_{+-} = \frac{1}{\sqrt6}A_{3/2} + \frac{1}{\sqrt3}A_{1/2}, \qquad A_{+0} = \frac{\sqrt3}{2}A_{3/2}, \qquad A_{00} = \frac{1}{\sqrt3}A_{3/2} - \frac{1}{\sqrt6}A_{1/2} \tag{21.56}$$

where $A_{ij}$ is the amplitude $\langle\pi^i\pi^j|H_{\rm nl}|B\rangle$. The $\pi^+\pi^0$ state can have only
$I = 2$, and arises solely from the tree diagram. Furthermore, the three complex
amplitudes $A_{+-}$, $A_{+0}$ and $A_{00}$ are expressed in terms of only two reduced
amplitudes $A_{3/2}$ and $A_{1/2}$, leading to one relation between them:

$$\frac{1}{\sqrt2}A_{+-} + A_{00} = A_{+0}, \tag{21.57}$$

FIGURE 21.9
The triangle formed by the three amplitudes $A_{ij}$ in equation (21.57).


which can be represented as a triangle in the complex plane, as shown in figure
21.9. There is a similar triangle for the charge conjugate processes

$$\frac{1}{\sqrt2}\bar A_{+-} + \bar A_{00} = \bar A_{+0}, \tag{21.58}$$

where the $\bar A$ amplitudes are obtained from the $A$s by complex conjugating the
CKM couplings, the strong phases remaining the same as usual.

Since $A_{+0}$ is pure tree, its weak phase is well defined, namely that of
$V_{ub}^*V_{ud}$, which is $\gamma$. It is convenient to define (Lipkin et al. 1991) $\tilde A =
\exp(2i\gamma)\bar A$, so that $\tilde A_{+0} = A_{+0}$. Then the two triangles have a common
base, $A_{+0}$. The failure of the two triangles to overlap exactly is a measure
of the penguin contribution. In principle, by measuring the asymmetry
coefficients $S_{\pi^+\pi^-}$, $C_{\pi^+\pi^-}$, the branching fractions of all three modes, and $C_{00}$,
one can construct the triangles. But unfortunately the relative orientation of
the triangles is not known, which leads to various possible solutions to $\alpha$ in
the range $0 < \alpha < 2\pi$. In addition, the data on $\pi^0\pi^0$ (with a branching ratio
of order $10^{-6}$) has sizeable experimental errors, and only a relatively loose
constraint on $\alpha$ can be obtained.

A much better constraint can be found from the CP asymmetries in
$B \to \rho\pi$ decays (Snyder and Quinn 1993). The method is essentially a
time-dependent version of the Dalitz plot analysis discussed in section 21.1. The
available channels are

$$B^0 \to \{\rho^+\pi^-, \rho^-\pi^+, \rho^0\pi^0\} \to \pi^+\pi^-\pi^0, \qquad \bar B^0 \to \{\rho^-\pi^+, \rho^+\pi^-, \rho^0\pi^0\} \to \pi^+\pi^-\pi^0 \tag{21.59}$$

where all result in the final state $\pi^+\pi^-\pi^0$ after the decay of the $\rho$ mesons,
and interferences following $B^0$-$\bar B^0$ mixing are possible.

Returning then to equations (21.46) and (21.47), the rate for the $3\pi$ decay
following a $B^0$ tag at $t = 0$ is

$$|A(\bar B^0(t) \to \pi^+\pi^-\pi^0)|^2 = \frac{1}{2}e^{-\Gamma t}\left\{|\bar A_{3\pi}|^2 + |A_{3\pi}|^2 + \left(|\bar A_{3\pi}|^2 - |A_{3\pi}|^2\right)\cos\Delta m_B t + 2\,{\rm Im}\left(\frac{q}{p}\,\bar A_{3\pi}A_{3\pi}^*\right)\sin\Delta m_B t\right\} \tag{21.60}$$

and there is a similar formula, with appropriate changes, for the case of a $\bar B^0$
tag at $t = 0$. We now write

$$A_{3\pi} = f_+(s_+)F^+ + f_-(s_-)F^- + f_0(s_0)F^0 \tag{21.61}$$

and similarly

$$\bar A_{3\pi} = f_+(s_+)\bar F^+ + f_-(s_-)\bar F^- + f_0(s_0)\bar F^0, \tag{21.62}$$

where $s_+ = (p_{\pi^+} + p_{\pi^0})^2$, $s_- = (p_{\pi^-} + p_{\pi^0})^2$, $s_0 = (p_{\pi^+} + p_{\pi^-})^2$, satisfying
$s_+ + s_- + s_0 = m_B^2 + 2m_{\pi^+}^2 + m_{\pi^0}^2$. $f_\kappa(s_\kappa)$ is the sum of three relativistic
Breit-Wigner resonance amplitudes, together with appropriate angular
momentum and angle factors, corresponding to the $\rho(770)$, $\rho(1450)$ and $\rho(1700)$
resonances. $F^\kappa$ is the amplitude for the quasi two-body mode $B^0 \to \rho^\kappa\pi^{\bar\kappa}$.
Here $\kappa$ takes the values $+$, $-$ and 0, and correspondingly $\bar\kappa = -, +, 0$. The
amplitudes $F^\kappa$ are complex and include the strong and weak transition phases,
from tree and penguin diagrams; they are, however, independent of the Dalitz
plot variables.

The $\rho\pi$ states have the same decomposition into tree and penguin parts
as discussed previously for the $\pi\pi$ states, namely

$$F^\kappa = e^{i\gamma}T^\kappa + e^{-i\beta}P^\kappa, \tag{21.63}$$

where the magnitudes of the weak couplings have been absorbed into $T^\kappa$ and
$P^\kappa$. We can rewrite (21.63) as

$$e^{i\beta}F^\kappa = -e^{-i\alpha}T^\kappa + P^\kappa \equiv A^\kappa \tag{21.64}$$

and similarly

$$e^{i\beta}(q/p)\bar F^\kappa = -e^{i\alpha}T^{\bar\kappa} + P^{\bar\kappa} \equiv \bar A^\kappa. \tag{21.65}$$

Then (21.61) and (21.62) become

$$A_{3\pi} = \sum_\kappa f_\kappa(s_\kappa)A^\kappa \tag{21.66}$$

$$(q/p)\bar A_{3\pi} = \sum_\kappa f_\kappa(s_\kappa)\bar A^\kappa, \tag{21.67}$$

disregarding a common overall phase $e^{-i\beta}$. When (21.66) and (21.67) are
inserted into (21.60), it is clear that one obtains many terms, for example

$${\rm Re}(f_+f_-^*)\,{\rm Im}(\bar A^+A^{-*} + \bar A^-A^{+*}), \qquad {\rm Im}(f_+f_-^*)\,{\rm Re}(\bar A^+A^{-*} - \bar A^-A^{+*}), \tag{21.68}$$

and so on, in which different resonances interfere on the Dalitz plot. The
strong, and known, rapid phase variation in these interference regions, via
factors such as $f_+f_-^*$, is a powerful tool for extracting the complex amplitudes
$A^\kappa$, $\bar A^\kappa$, and hence via (21.64) and (21.65) the phase $\alpha$. The quantities
multiplying the interference terms ${\rm Re}(f_+f_-^*)$ and ${\rm Im}(f_+f_-^*)$ are the key degrees of
freedom which allow this analysis to determine the penguin contributions and
the strong phases, and hence $\alpha$. However, the resonance overlap regions cover
a small fraction of the Dalitz plot, so that a substantial data sample (a few
thousand events) is needed to constrain all the amplitude parameters.

An isospin analysis similar to that of the $\pi\pi$ states can be done for the
$\rho\pi$ states, but now there is no reason to forbid the final state to have $I = 1$.
Nevertheless, if charged B decays are also included, there are five physical
amplitudes ($B^0 \to \rho^+\pi^-, \rho^-\pi^+, \rho^0\pi^0$; $B^+ \to \rho^+\pi^0, \rho^0\pi^+$) which are
expressible in terms of two pure tree ($\Delta I = 3/2$) transitions to $I = 1, 2$ final
states. One of the pure tree amplitudes may be written (Gronau 1991) as the
sum $A^+ + A^- + 2A^0$, and hence the ratio $(\bar A^+ + \bar A^- + 2\bar A^0)/(A^+ + A^- + 2A^0)$
has the phase $2\alpha$.

This approach has been followed by both BaBar and Belle, with the results


BaBar (Aubert et al. 2007): $\alpha = (87^{+45}_{-13})^\circ$   (21.69)
Belle (Kusaka et al. 2007): $68^\circ < \alpha < 95^\circ$.   (21.70)


These results are consistent with the values of $\beta$ and $\gamma$ given in (21.52), (21.25)
and (21.26), given the definition $\alpha = \pi - \beta - \gamma$.
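This consistency statement can be checked with a few lines of arithmetic. The Python sketch below combines the central value of $\sin 2\beta$ from (21.52) with the central values of $\gamma$ in (21.25) and (21.26), read here as $68^\circ$ and $78.4^\circ$, and compares $\alpha = 180^\circ - \beta - \gamma$ with the Belle interval (21.70). Uncertainties are ignored and the smaller solution for $\beta$ is assumed, so this is only a rough cross-check; the variable and function names are illustrative.

```python
import math

SIN_2BETA = 0.673                                  # central value of (21.52)
GAMMA_DEG = {"BaBar": 68.0, "Belle": 78.4}         # central values of (21.25), (21.26)
ALPHA_RANGE_DEG = (68.0, 95.0)                     # Belle interval (21.70)

def alpha_from_triangle(sin_2beta, gamma_deg):
    """alpha = 180 - beta - gamma (degrees), using the smaller beta solution."""
    beta_deg = 0.5 * math.degrees(math.asin(sin_2beta))
    return 180.0 - beta_deg - gamma_deg

if __name__ == "__main__":
    low, high = ALPHA_RANGE_DEG
    for experiment, gamma_deg in GAMMA_DEG.items():
        alpha = alpha_from_triangle(SIN_2BETA, gamma_deg)
        inside = low < alpha < high
        print(f"gamma ({experiment}) = {gamma_deg:5.1f} deg -> alpha = {alpha:5.1f} deg, "
              f"inside (21.70): {inside}")
```

Given the errors of ten degrees or more on $\gamma$, the agreement is comfortable, though not yet a stringent test.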

Of course, this is only one (at present not very tight) consistency check.
But there are now very many independent measurements of the magnitudes of
the CKM matrix elements, as well as the angles. We shall not describe these
here, referring the reader to the regular updates by the Particle Data Group
(currently Nakamura et al. 2010). We showed in figure 20.11 the 2010 plot of
the constraints in the $\bar\rho, \bar\eta$ plane, presented by Ceccucci et al. They concluded
that the 95% CL regions all overlapped consistently around the global fit
region, though the consistency of $|V_{ub}|$ and $\sin 2\beta$ was not very good. It would
be premature to make too much of this minor reservation, though it may be
noted that $\sin 2\beta$ could be sensitive to new physics via short-distance
corrections to the box diagrams of figure 21.5, while $|V_{ub}|$ is obtained from a
tree-level process, and is thus unlikely to be affected by new physics. Overall, the
consistency represented in figure 20.11 must be counted as a major triumph
of the Standard Model, in particular of the original analysis by Kobayashi
and Maskawa (1973). It must be remembered, though, that many extensions
of the Standard Model allow considerable room for new CP-violating effects,
which could be revealed by increasingly precise determinations of the CKM
parameters.


21.3 CP violation in neutral K-meson decays


Although the formalism is similar, the phenomenology of CP violation in
neutral K-meson decays is very different from that in neutral B-meson decays.
In the K case, CP violation is a very small effect, typically at the level of
parts per thousand or smaller; its observation by Christenson et al. (1964) was
a historic achievement. But the neutral K system is most simply (and
traditionally) approached by starting with the assumption that CP is conserved.
We will define $CP|K^0\rangle = -|\bar K^0\rangle$; then the CP eigenstates are

$$|K_\pm\rangle = \frac{1}{\sqrt2}\left(|K^0\rangle \mp |\bar K^0\rangle\right). \tag{21.71}$$

The CP = 1 state can decay to two pions in an s-state, but not to three pions
if (as we are assuming to start with) CP is a good symmetry; the situation
is the opposite for the CP = −1 state. The Q-value for the three pion mode
is very much smaller than for the two pion mode, with the result that the
$|K_+\rangle$ state, decaying to two pions, has a much shorter lifetime than the $|K_-\rangle$
state: $\tau_S \sim 0.9 \times 10^{-10}$ s, $\tau_L \sim 5 \times 10^{-8}$ s. Due to CP violation, the actual
eigenstates $|K_L\rangle$ and $|K_S\rangle$ of the effective Hamiltonian are slightly different
from $|K_\mp\rangle$ (see (21.75) and (21.76)), with masses $m_L$ and $m_S$, and widths $\Gamma_L$
and $\Gamma_S$. At this point, however, we shall associate $m_S$ and $\Gamma_S$ with $|K_+\rangle$, and
$m_L$ and $\Gamma_L$ with $|K_-\rangle$.

A $K^0$ is produced in strangeness-conserving reactions such as $K^+n \to K^0p$,
and a $\bar K^0$ in $K^- + p \to \bar K^0 + n$, for example. However, the two states can mix
following production, since (as usual) it is the Hamiltonian eigenstates which
propagate in free space, and they are the superpositions $|K_\pm\rangle$, assuming CP
is conserved. Hence, as time proceeds following production, a state produced
as a $K^0$ at time $t = 0$ will evolve into the state

$$|K^0(t)\rangle = \frac{1}{2}\left(e^{-\Gamma_Lt/2 - im_Lt} + e^{-\Gamma_St/2 - im_St}\right)|K^0\rangle + \frac{1}{2}\left(e^{-\Gamma_Lt/2 - im_Lt} - e^{-\Gamma_St/2 - im_St}\right)|\bar K^0\rangle. \tag{21.72}$$

The probability that a $K^0$ ($\bar K^0$) will then be observed at time $t$ following
production (in the K-meson rest frame) is

$$P_\pm(t) = \frac{1}{4}\left[e^{-\Gamma_Lt} + e^{-\Gamma_St} \pm 2e^{-(\Gamma_L+\Gamma_S)t/2}\cos\Delta m\,t\right] \tag{21.73}$$

with the upper (lower) sign for $K^0$ ($\bar K^0$), and
where $\Delta m = m_L - m_S$. This is the famous phenomenon of strangeness
oscillations, predicted by Gell-Mann and Pais (1955). Experimentally, the
strangeness of the state at time $t$ is defined by the modes $K^0 \to \pi^-\ell^+\nu_\ell$
and $\bar K^0 \to \pi^+\ell^-\bar\nu_\ell$. The difference $P_+(t) - P_-(t)$ is measured, and although
the oscillations are heavily damped by $\exp(-\Gamma_St/2)$, the mass difference can be
determined:

$$\Delta m = (3.483 \pm 0.006) \times 10^{-12}\ {\rm MeV}. \tag{21.74}$$
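The damping and the oscillation can be seen by evaluating (21.73) directly. The Python sketch below does so using the approximate lifetimes quoted above and the mass difference (21.74); the value of $\hbar$, the function names and the choice of sample times are assumptions introduced for the illustration.

```python
import math

HBAR_MEV_S = 6.582119569e-22   # reduced Planck constant in MeV s (assumed input)
TAU_S = 0.9e-10                # short lifetime in s, approximate value from the text
TAU_L = 5.0e-8                 # long lifetime in s, approximate value from the text
DELTA_M_MEV = 3.483e-12        # central value of (21.74)

def delta_m_per_second():
    """Mass difference (21.74) expressed as an angular frequency in s^-1."""
    return DELTA_M_MEV / HBAR_MEV_S

def strangeness_probabilities(t):
    """Return (P_+, P_-) of (21.73): probabilities to find K0 and K0bar at
    proper time t (seconds) in a beam that was pure K0 at t = 0."""
    gamma_s, gamma_l = 1.0 / TAU_S, 1.0 / TAU_L
    incoherent = math.exp(-gamma_l * t) + math.exp(-gamma_s * t)
    interference = 2.0 * math.exp(-0.5 * (gamma_l + gamma_s) * t) * math.cos(delta_m_per_second() * t)
    return 0.25 * (incoherent + interference), 0.25 * (incoherent - interference)

if __name__ == "__main__":
    print(f"Delta m = {delta_m_per_second():.4e} s^-1")
    for n in (0, 1, 2, 4, 8):
        t = n * TAU_S
        p_plus, p_minus = strangeness_probabilities(t)
        print(f"t = {n} tau_S:  P_+ = {p_plus:.4f}  P_- = {p_minus:.4f}  "
              f"difference = {p_plus - p_minus:+.4f}")
```

Because $\Delta m\,\tau_S$ is of order one half, only about one oscillation is visible before the interference term has died away.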


However, this is not the whole story. Christenson et al. (1964) found
that, after many $\tau_S$ lifetimes, some $2\pi$ events were observed, indicating that
the surviving state $K_L$ was capable of decaying to $2\pi$ after all (albeit very
rarely). The same conclusion follows from the fact that $P_+(t) - P_-(t)$ does
not go to zero at long times, as it should from (21.73). Accordingly, the true
Hamiltonian eigenstates are not quite the CP eigenstates, but rather

$$|K_L\rangle = \frac{(1+\tilde\epsilon)|K^0\rangle + (1-\tilde\epsilon)|\bar K^0\rangle}{\sqrt{2(1+|\tilde\epsilon|^2)}} \tag{21.75}$$

$$|K_S\rangle = \frac{(1+\tilde\epsilon)|K^0\rangle - (1-\tilde\epsilon)|\bar K^0\rangle}{\sqrt{2(1+|\tilde\epsilon|^2)}}. \tag{21.76}$$

This is a traditional parametrization in K-physics, similar to that in (21.32)
with $q/p = (1-\tilde\epsilon)/(1+\tilde\epsilon)$ (this is why we chose $CP|K^0\rangle = -|\bar K^0\rangle$). We now
find that a state which starts at $t = 0$ as a $K^0$ evolves to

$$|K^0(t)\rangle = g_+(t)|K^0\rangle + \frac{1-\tilde\epsilon}{1+\tilde\epsilon}\,g_-(t)|\bar K^0\rangle \tag{21.77}$$

where

$$g_\pm(t) = e^{-\Gamma_Lt/2}e^{-im_Lt}\left[1 \pm e^{-\Delta\Gamma t/2}e^{i\Delta m\,t}\right], \tag{21.78}$$

with $\Delta\Gamma = \Gamma_S - \Gamma_L$, $\Delta m = m_L - m_S$, and we have omitted a normalization
factor. Similarly, a state tagged as $\bar K^0$ at $t = 0$ evolves to

$$|\bar K^0(t)\rangle = \frac{1+\tilde\epsilon}{1-\tilde\epsilon}\,g_-(t)|K^0\rangle + g_+(t)|\bar K^0\rangle. \tag{21.79}$$

FIGURE 21.10
Box diagrams contributing to $K^0$-$\bar K^0$ mixing.


The $K^0$-$\bar K^0$ mixing amplitude arises in the Standard Model from the box
graphs shown in figure 21.10 (cf figure 21.5). These contain factors of $m_q^2$,
but the magnitude of the four CKM couplings to the t quark are of order $\lambda^{10}$,
compared with $\lambda^2$ for the c quark, so that the c quark diagram dominates, with
a CKM factor of $(V_{cs}V_{cd}^*)^2$, which is real to a good approximation. This means
that ${\rm Im}\,\epsilon$ is very small. A comparison of the mass difference $\Delta m$ predicted
from figure 21.10 and the experimental value is complicated by uncertainties
in the hadronic matrix element.

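The traditional reactions in which CP violation is probed in K decays are
the $2\pi$ modes, where one looks for the existence of $K_L \to 2\pi$. There is also
the semileptonic asymmetry. Three common observables are defined by

$$\eta_{00} = \frac{\langle\pi^0\pi^0|H_{\rm nl}|K_L\rangle}{\langle\pi^0\pi^0|H_{\rm nl}|K_S\rangle} \quad{\rm and}\quad \eta_{+-} = \frac{\langle\pi^+\pi^-|H_{\rm nl}|K_L\rangle}{\langle\pi^+\pi^-|H_{\rm nl}|K_S\rangle} \tag{21.80}$$

and

$$\delta_L = \frac{\Gamma(K_L \to \pi^-\ell^+\nu_\ell) - \Gamma(K_L \to \pi^+\ell^-\bar\nu_\ell)}{\Gamma(K_L \to \pi^-\ell^+\nu_\ell) + \Gamma(K_L \to \pi^+\ell^-\bar\nu_\ell)}. \tag{21.81}$$

The experimental numbers are (Nakamura et al. 2010)

$$|\eta_{00}| = (2.221 \pm 0.011) \times 10^{-3}, \qquad |\eta_{+-}| = (2.232 \pm 0.011) \times 10^{-3} \tag{21.82}$$

$${\rm Arg}\,\eta_{00} \approx 43.5^\circ, \qquad {\rm Arg}\,\eta_{+-} \approx 43.5^\circ \tag{21.83}$$

and

$$\delta_L = (3.32 \pm 0.06) \times 10^{-3}. \tag{21.84}$$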


It is useful to describe the final $2\pi$ states in terms of their isospin, which
then have a definite strong interaction phase. As noted in connection with the
B decays, the allowed isospin states are only $I = 0$ and $I = 2$, and one has

$$A_{K^0\to\pi^+\pi^-} = \sqrt{\tfrac{2}{3}}\,|A_0|e^{i(\delta_0+\phi_0)} + \sqrt{\tfrac{1}{3}}\,|A_2|e^{i(\delta_2+\phi_2)} \tag{21.85}$$

$$A_{\bar K^0\to\pi^+\pi^-} = -\sqrt{\tfrac{2}{3}}\,|A_0|e^{i(\delta_0-\phi_0)} - \sqrt{\tfrac{1}{3}}\,|A_2|e^{i(\delta_2-\phi_2)} \tag{21.86}$$

where the minus sign arises from our choice $CP|K^0\rangle = -|\bar K^0\rangle$, and where $\delta_I$
and $\phi_I$ are the strong and weak phases, respectively, for the state with isospin
$I$. Also,

$$A_{K^0\to\pi^0\pi^0} = \sqrt{\tfrac{1}{3}}\,|A_0|e^{i(\delta_0+\phi_0)} - \sqrt{\tfrac{2}{3}}\,|A_2|e^{i(\delta_2+\phi_2)} \tag{21.87}$$

$$A_{\bar K^0\to\pi^0\pi^0} = -\sqrt{\tfrac{1}{3}}\,|A_0|e^{i(\delta_0-\phi_0)} + \sqrt{\tfrac{2}{3}}\,|A_2|e^{i(\delta_2-\phi_2)}. \tag{21.88}$$

The significant fact experimentally is that $|A_2|/|A_0| \sim 1/22$, a manifestation of
the ‘$\Delta I = 1/2$’ rule in this case (i.e. $\Delta I = 3/2$ is suppressed; see, for example,
Donoghue et al. 1992, section VIII-4). Inserting (21.85) and (21.86), (21.87)
and (21.88), into (21.80) and retaining only first-order terms in $|A_2|/|A_0|$, and
treating $\phi_0$ and $\phi_2$ as small, we find (problem 21.5)

$$\eta_{00} = \tilde\epsilon + i\phi_0 - i\sqrt2\,\frac{|A_2|}{|A_0|}(\phi_2-\phi_0)e^{i(\delta_2-\delta_0)} \tag{21.89}$$

$$\eta_{+-} = \tilde\epsilon + i\phi_0 + \frac{i}{\sqrt2}\,\frac{|A_2|}{|A_0|}(\phi_2-\phi_0)e^{i(\delta_2-\delta_0)}. \tag{21.90}$$

These relations are usually written as

$$\eta_{00} = \epsilon - 2\epsilon', \qquad \eta_{+-} = \epsilon + \epsilon', \tag{21.91}$$

where

$$\epsilon = \tilde\epsilon + i\phi_0, \qquad \epsilon' = \frac{i\,e^{i(\delta_2-\delta_0)}}{\sqrt2}\,\frac{|A_2|}{|A_0|}(\phi_2-\phi_0). \tag{21.92}$$

The merit of this manoeuvring is that the parameter $\epsilon'$ involves only the CP
violation in the transition amplitude (‘direct CP violation’), while $\epsilon$ involves
both a transition phase and the mixing parameter $\tilde\epsilon$.

What can experiment tell us about $\epsilon$ and $\epsilon'$? Consider first $\delta_{\rm L}$. Assuming
that $|A(K^0 \to \ell^+\nu_\ell\pi^-)| = |A(\bar K^0 \to \ell^-\bar\nu_\ell\pi^+)|$ and that $A(K^0 \to \ell^-\bar\nu_\ell\pi^+) =
A(\bar K^0 \to \ell^+\nu_\ell\pi^-) = 0$, we find

$$\delta_{\rm L} = 2\,{\rm Re}\,\tilde\epsilon/(1 + |\tilde\epsilon|^2) \approx 2\,{\rm Re}\,\epsilon, \tag{21.93}$$

so that $\delta_{\rm L}$ is sensitive to the same parameter as appears in the $K_{\rm L} \to \pi\pi$
decays. An interesting observable is the ratio between the ratios of the decay
rates to $\pi^+\pi^-$ and $\pi^0\pi^0$ of $K_{\rm S}$ and $K_{\rm L}$. One finds

$$\frac{1}{6}\left(1 - \frac{|\eta_{00}|^2}{|\eta_{+-}|^2}\right) \approx {\rm Re}\,(\epsilon'/\epsilon), \tag{21.94}$$

which from equation (21.82) is another small number, approximately equal
to $1.64 \times 10^{-3}$. In the years before the B factories opened, $\epsilon'$ was the only
window into CP violation in the transition amplitude. But all the branching
ratios in (21.94) are of order $10^{-3}$, and establishing a non-zero value of $\epsilon'$ was
very difficult. The first claim for non-zero $\epsilon'$ was by the NA31 experiment at
CERN (Barr et al. 1993), a 3.5 standard deviation effect. But a contemporary
experiment at Fermilab (Gibbons et al. 1993) found a result compatible with
zero. The next generation of experiments produced agreement:


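$${\rm Re}(\epsilon'/\epsilon) = (2.07 \pm 0.28) \times 10^{-3} \quad \text{Alavi-Harati et al. 2003 (KTeV)} \tag{21.95}$$

$${\rm Re}(\epsilon'/\epsilon) = (1.47 \pm 0.22) \times 10^{-3} \quad \text{Batley et al. 2002 (NA48)}. \tag{21.96}$$

The current world average is $(1.65 \pm 0.26) \times 10^{-3}$. Fits to all the data also
yield (Nakamura et al. 2010)

$$|\epsilon| = (2.228 \pm 0.011) \times 10^{-3}. \tag{21.97}$$

The experimental value of $\delta_{\rm L}$ gives us ${\rm Re}\,\epsilon \approx 1.66 \times 10^{-3}$, and we can deduce
that $\arg\epsilon \approx \pi/4$. The phase of $\epsilon'$ is $\pi/2 + \delta_2 - \delta_0$, which happens also to be
approximately $\pi/4$. It follows that $\epsilon'/\epsilon$ is very nearly real.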
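The two numerical statements above can be checked with a few lines of arithmetic. The sketch below is only illustrative: it evaluates the left-hand side of (21.94) from the central values in (21.82), and estimates the phase of $\epsilon$ from ${\rm Re}\,\epsilon$ and $|\epsilon|$, assuming (as the text implies) that the phase lies in the first quadrant. The variable names are ours.

```python
import math

# Central values of |eta_00| and |eta_+-| quoted in (21.82).
eta_00 = 2.221e-3
eta_pm = 2.232e-3

# Left-hand side of (21.94): (1/6) * (1 - |eta_00|^2 / |eta_+-|^2).
re_eps_prime_over_eps = (1.0 - (eta_00 / eta_pm) ** 2) / 6.0

# Phase of epsilon from Re(epsilon) and |epsilon| (21.97),
# taking the first-quadrant solution (an assumption of this sketch).
re_eps = 1.66e-3
abs_eps = 2.228e-3
arg_eps_deg = math.degrees(math.acos(re_eps / abs_eps))

print(f"Re(eps'/eps) ~ {re_eps_prime_over_eps:.2e}")
print(f"arg(eps)     ~ {arg_eps_deg:.1f} degrees (pi/4 is 45 degrees)")
```

Uncertainties are ignored here; the point is only that the quoted central values reproduce the size of ${\rm Re}(\epsilon'/\epsilon)$ and a phase close to $\pi/4$.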

Comparison of these small numbers with theoretical predictions is com- 
plicated by hadronic uncertainties, and it is beyond our scope to pursue that 
issue. 

In closing this discussion of mesonic mixing and CP violation, we briefly
discuss the charm sector. First, we note that $D^0$-$\bar D^0$ mixing has been observed
(Aubert et al. 2007c, Staric et al. 2007, Aaltonen et al. 2008). CP-violating
effects in charm decays have been generally expected to be very small. A
rough estimate of the direct CP-violating asymmetries in D decays can be
made following the method of section 21.1. Consider, for example, the decays
$D^0 \to K^+K^-$ and $\bar D^0 \to K^-K^+$. As in (21.5) and (21.10), the amplitude for
the first decay is

$$A(D^0 \to K^+K^-) = V^*_{cs}V_{us}\,T_{KK} + V^*_{cb}V_{ub}\,P_{KK} \tag{21.98}$$

$$\qquad\qquad = T\,(1 + r_K\,e^{i(\delta_K-\gamma)}), \tag{21.99}$$

where $r_K$ is the relative magnitude of the penguin contribution, and $\delta_K$ is
the relative strong phase. The amplitude for the CP-conjugate process is the
same, with $\gamma$ replaced by $-\gamma$. The penguin contribution is CKM-suppressed
by a factor $V^*_{cb}V_{ub}/V^*_{cs}V_{us} \sim \lambda^4$, and there is also a loop factor, so that $r_K$
would seem to be of order $10^{-4}$. The asymmetry is then

$$A^{\rm D}_K = \frac{|A(D^0 \to K^+K^-)|^2 - |A(\bar D^0 \to K^-K^+)|^2}{|A(D^0 \to K^+K^-)|^2 + |A(\bar D^0 \to K^-K^+)|^2} \tag{21.100}$$

$$\qquad = 2r_K\sin\gamma\,\sin\delta_K, \tag{21.101}$$

which is indeed very small. A similar argument predicts the asymmetry in
the decays $D^0 \to \pi^+\pi^-$ and $\bar D^0 \to \pi^-\pi^+$ to be

$$A^{\rm D}_\pi = -2r_K\sin\gamma\,\sin\delta_K. \tag{21.102}$$


Recently, however, the LHCb collaboration has published a measurement
of the difference between the time-integrated CP asymmetries in the $KK$ and
$\pi\pi$ decays, which to a very good approximation can be identified with the
difference between the direct asymmetries (21.101) and (21.102). The LHCb
result is (Aaij et al. 2012)

$$A^{\rm D}_K - A^{\rm D}_\pi = (-0.82 \pm 0.21 \pm 0.11)\%, \tag{21.103}$$

which is substantially larger than the estimates (21.101) and (21.102).
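To see how far the naive estimate falls short of the measurement, one can evaluate the asymmetry directly. The sketch below computes the exact asymmetry for an amplitude of the form (21.99) and compares it with the leading-order expression (21.101). The numerical inputs are illustrative assumptions, not values from the text: $\gamma = 70^\circ$ and a maximal strong phase $\delta_K = \pi/2$, with $r_K = 10^{-4}$ as the order of magnitude suggested above.

```python
import math

def direct_asymmetry(r, gamma, delta):
    """Exact asymmetry for A = T(1 + r e^{i(delta - gamma)}) and its CP conjugate."""
    a = abs(1 + r * complex(math.cos(delta - gamma), math.sin(delta - gamma))) ** 2
    a_bar = abs(1 + r * complex(math.cos(delta + gamma), math.sin(delta + gamma))) ** 2
    return (a - a_bar) / (a + a_bar)

r_k = 1e-4                   # order-of-magnitude estimate from the text
gamma = math.radians(70.0)   # illustrative CKM angle (assumption)
delta_k = math.pi / 2        # strong phase chosen to maximise the effect (assumption)

exact = direct_asymmetry(r_k, gamma, delta_k)
leading = 2 * r_k * math.sin(gamma) * math.sin(delta_k)   # form of (21.101)

# With an equal and opposite pi pi asymmetry, as in (21.102),
# the difference of the two asymmetries is twice the KK value.
max_difference = 2 * leading

print(f"exact asymmetry       : {exact:.3e}")
print(f"leading-order estimate: {leading:.3e}")
print(f"estimated difference  : {100 * max_difference:.3f} %  (measured magnitude: 0.82 %)")
```

Even with the strong phase chosen to maximise the effect, the estimated difference is more than an order of magnitude below the central value in (21.103).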

It is possible that this $3.5\sigma$ effect (the first evidence for CP-violation in
the charm sector) indicates the presence of some new physics. However, it
must be noted that the mass scale of the charm quark, $m_c \approx 1.3$ GeV, is not
large enough to be safely in the perturbative QCD regime (as indicated by
the parameter $\Lambda_{\overline{\rm MS}}/m_c$), so that non-perturbative enhancements are possible.
CP-violation in the charm sector promises to be an interesting area for
experimental and theoretical exploration.


21.4 Neutrino mixing and oscillations 
21.4.1 Neutrino mass and mixing 


Experiments with solar, atmospheric, reactor and accelerator neutrinos have
established the phenomenon of neutrino oscillations caused by non-zero neutrino
masses and mixing. We shall give an elementary introduction to this
topic, which is a highly active field of research in particle physics; there are
analogies with the meson oscillations we have been considering.

It is fair to say that in the original Standard Model the neutrinos were 
taken to be massless, but there was no compelling theoretical reason for this, 
and the framework of the Standard Model can easily be extended to include 
massive neutrinos. However, one question immediately arises: are neutrinos 
Dirac or Majorana fermions? As explained in section 20.3, we do not yet know 
the answer, and it may be some time before we do. The way the mass terms 
enter the Lagrangian is, in fact, different in the two cases. We are familiar 
with the Dirac mass term 


$$m\bar\psi\psi = m(\bar\psi_{\rm R}\psi_{\rm L} + \bar\psi_{\rm L}\psi_{\rm R}), \tag{21.104}$$

where $\psi$ is a four-component Dirac field, and R and L refer to the chirality
components. We learned in section 7.5.2 that a Majorana mass term can be
written in the form

$$m\,\chi^{\rm T}_{\rm L}\,i\sigma_2\,\chi_{\rm L} + {\rm h.c.} \tag{21.105}$$

where $\chi_{\rm L}$ is a two-component field of L chirality. A similar expression could
be written using a two-component R-chirality field. The difference in form 
between the Dirac and Majorana mass terms leads to a difference in the 
parametrization of neutrino mixing, as we shall see. 

Suppose, first, that the neutrinos are Dirac particles, with both L and R 
chiralities (or equivalently either helicity) for a given mass. We remind the 
reader that this is not ruled out experimentally, since the non-observation of 
the ‘wrong’ helicity component may be accounted for by the appearance of a 
suppression factor (m/ E), where m is a neutrino mass and E is an average 
neutrino energy (see section 20.2.2). We also assume that their interactions 
have the V-A structure indicated by the phenomenology of the previous chap- 
ter. Then only the L (R) chirality component of a neutrino (antineutrino) field 
feels the weak force; the R (L) component of a neutrino (antineutrino) field 
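has no interactions of Standard Model type. But, just as in the quark case, it
will in general be necessary to allow for the possibility that the L-components
of the fields which have definite neutrino mass, call them $\nu_{1{\rm L}}, \nu_{2{\rm L}}, \nu_{3{\rm L}}$, are not
the same as the fields $\nu_{e{\rm L}}, \nu_{\mu{\rm L}}, \nu_{\tau{\rm L}}$ which enter into the charged current V-A
interaction. For Dirac neutrinos, we therefore write

$$\begin{pmatrix} \nu_e \\ \nu_\mu \\ \nu_\tau \end{pmatrix}_{\rm L} = \begin{pmatrix} U_{e1} & U_{e2} & U_{e3} \\ U_{\mu 1} & U_{\mu 2} & U_{\mu 3} \\ U_{\tau 1} & U_{\tau 2} & U_{\tau 3} \end{pmatrix} \begin{pmatrix} \nu_1 \\ \nu_2 \\ \nu_3 \end{pmatrix}_{\rm L} \equiv U \begin{pmatrix} \nu_1 \\ \nu_2 \\ \nu_3 \end{pmatrix}_{\rm L}, \tag{21.106}$$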


where the unitary matrix U is the PMNS matrix, named after Pontecorvo 
(1957, 1958, 1967), and Maki, Nakagawa and Sakata (1962). 

Now we showed in section 20.7.3 that the general $3 \times 3$ unitary matrix has
three real (rotation angle) parameters, and six phase parameters, five of which
we could get rid of by rephasing the quark fields by global U(1) transformations
of the form $q' = \exp(i\theta)q$. Such rephasing transformations are equally allowed
for the charged leptons, and also for Dirac neutrinos, since evidently the mass
term (21.104) is invariant under a global U(1) transformation $\psi' = \exp(i\theta)\psi$.
Hence the matrix U will, in this Dirac case, have a parametrization of the
CKM form, with one CP-violating phase.

The mixing described by (21.106) implies that the individual lepton flavour
numbers $L_e$, $L_\mu$, $L_\tau$ are no longer conserved. However, since we are here taking
the neutrinos to be Dirac particles, there will be a quantum number carried by
$\nu_e$, $\nu_\mu$ and $\nu_\tau$ which is conserved by the interactions. This could, for example,
be the total lepton number $L_e + L_\mu + L_\tau$, assigning $L(\nu_\alpha) = 1$ for $\alpha = e, \mu, \tau$,
which would follow from invariance under the global U(1) transformation $\ell'_\alpha =
\exp(i\theta)\ell_\alpha$, $\nu'_\alpha = \exp(i\theta)\nu_\alpha$, where $\theta$ is independent of the flavour $\alpha$.

This ‘Dirac’ option, though simple, may be thought uneconomical, how- 
ever. As noted, the R components of neutrino fields have no interactions of 
Standard Model type. The charged leptons do have electromagnetic interac- 
tions, of course, as do the quarks, which also have strong interactions. But 
the neutral neutrinos only have weak interactions, which involve only their 
L-components. Why, then, enlarge the field content to include hypothetical
$\nu_{\rm R}$ fields, which don't have any SM interactions? It seems more economical to
make do with only the $\nu_{\rm L}$ fields. In this case, the Dirac mass term (21.104) is
not possible, but a Majorana mass term (21.105) can still exist. Clearly, such 
a mass term is not invariant under U(1) global phase transformations, and it 
breaks lepton number conservation explicitly. As in the Dirac case, the chiral 
L component will include a ‘wrong’ (i.e. positive) helicity component with an 
amplitude proportional to m/E. 

The fact that global phase changes on the neutrino fields are now no longer
freely available, because that symmetry is lost if they are Majorana fields, has
implications for the mixing matrix, call it $U_{\rm M}$, in this case. Since the three
Majorana neutrino fields can no longer absorb phases, we have only the three
phases from the charged leptons at our disposal, which leaves three phase
parameters in $U_{\rm M}$ after rephasing. The PMNS matrix in the Majorana case
therefore has two more irreducible phase parameters than the CKM matrix,
and is conventionally parametrized as

$$U_{\rm M} = U({\rm CKM\text{-}type}) \times {\rm diag}(1, e^{i\alpha_{21}/2}, e^{i\alpha_{31}/2}). \tag{21.107}$$

There are three CP-violating phases in the Majorana neutrino case.
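The phase counting used in the Dirac and Majorana cases can be written out for a general number $n$ of generations. The following sketch simply encodes the argument of the text: an $n \times n$ unitary matrix has $n(n-1)/2$ angles and $n(n+1)/2$ phases; in the Dirac case $2n - 1$ phases can be absorbed by rephasing the fields, while in the Majorana case only the $n$ charged-lepton fields can absorb phases. The function name is ours.

```python
def mixing_parameters(n, majorana=False):
    """Return (angles, physical phases) of an n x n unitary lepton mixing matrix."""
    angles = n * (n - 1) // 2     # rotation angles of an n x n unitary matrix
    phases = n * (n + 1) // 2     # phases before any field rephasing
    # Dirac: 2n fields can be rephased, but one overall phase has no effect.
    # Majorana: only the n charged-lepton fields can absorb phases.
    removable = n if majorana else 2 * n - 1
    return angles, phases - removable

for n in (2, 3):
    print(n, "Dirac:", mixing_parameters(n), "Majorana:", mixing_parameters(n, majorana=True))
```

For $n = 3$ this gives three angles with one phase in the Dirac case and three phases in the Majorana case, as stated above; for $n = 2$ the single Majorana phase is the only CP-violating parameter.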

The only information at present (2012) concerning the entries in U comes
from neutrino oscillation experiments, which we shall discuss in the next sections.
We shall see that the Majorana phases $\alpha_{21}$ and $\alpha_{31}$ cancel in the
probabilities calculated for neutrino transitions, and no experiment so far is
sensitive to CP-violating effects in the neutrino sector. We shall discuss how
the values of the parameters $\theta_{12}$, $\theta_{13}$ and $\theta_{23}$ can be inferred from the observed
oscillations, and also the differences in the squared masses of the neutrinos.
Anticipating these results, we state here that the two independent squared
mass differences, $m_2^2 - m_1^2$ and $m_3^2 - m_1^2$, turn out to be very small indeed, and
rather different from each other: namely approximately $7.6 \times 10^{-5}$ eV$^2$ and
$2.4 \times 10^{-3}$ eV$^2$, respectively. The smaller value is associated with oscillations
of solar or reactor neutrinos, and the larger with oscillations of atmospheric
or accelerator neutrinos.

Data on the actual mass values are limited. There is a bound on the
$\bar\nu_e$ mass from measurements of the electron spectrum near the end-point in
tritium $\beta$-decay, which gives (Lobashev et al. 2003, Eitel et al. 2005)

$$m_{\bar\nu_e} < 2.3\ {\rm eV} \quad 95\%\ {\rm CL}. \tag{21.108}$$

A weaker limit on $m_{\nu_\mu}$ comes from measurements of the muon spectrum in
charged pion decay:

$$m_{\nu_\mu} < 0.19\ {\rm MeV} \quad 95\%\ {\rm CL}. \tag{21.109}$$

The strongest upper bound comes from cosmology, assuming three neutrinos.
The Cosmic Microwave Background data of the WMAP experiment, combined
with supernovae data and data on galaxy clustering, can be used to obtain an
upper limit on the sum of three neutrino masses (Spergel et al. 2007):

$$\sum_{i=1}^{3} m_i < 0.68\ {\rm eV} \quad 95\%\ {\rm CL}. \tag{21.110}$$


Taking the squared mass differences as indicative of the actual mass scale, 
neutrino masses are evidently very much smaller than the masses of the other 
fermions in the Standard Model. We shall return to what this might tell 
us about the origin of neutrino mass in section 22.5, where we discuss how 
gauge-invariant masses are generated in the Standard Model. 

Returning to the question of CP violation, we noted in section 4.2.3 that 
the CP violation present in the Standard Model was insufficient to account for 
the matter-antimatter asymmetry in the universe. However, we now see that 
it is possible to have CP violation in the lepton sector, in an extended Stan- 
dard Model with massive neutrinos. Leptonic matter-antimatter asymmetries 
can be converted into baryon asymmetries in the very hot early universe by a 
non-perturbative process predicted by Standard Model dynamics — a process 
called leptogenesis (Fukugita and Yanagida 1986, Kuzmin, Rubakov and Sha- 
poshnikov 1985). It has been argued that the Dirac and/or Majorana phases 
in the neutrino matrix $U$ or $U_{\rm M}$ can provide the CP violation necessary in
leptogenesis models for the generation of the observed baryon asymmetry of 
the universe (Pascoli et al. 2007a, 2007b). If such a proposal should prove 
to be the case, the reach of Pauli’s ‘desperate remedy’ will have been vast 
indeed. 


21.4.2 Neutrino oscillations: formulae 


The existence of neutrino oscillations means that if a neutrino of a given
flavour $\nu_\alpha$ ($\alpha = e, \mu, \tau$) with energy E is produced in a charged current weak
interaction process, such as $\pi^+ \to \mu^+\nu_\mu$, then at a sufficiently large distance L
from the $\nu_\alpha$ source the probability $P(\nu_\alpha \to \nu_\beta; E, L)$ of detecting a neutrino of
a different flavour $\nu_\beta$ is non-zero.² Such a flavour change will of course imply
that the $\nu_\alpha$ survival probability, $P(\nu_\alpha \to \nu_\alpha; E, L)$, is less than 1. We shall
give a simplified version of the derivation of such probabilities, following the
approach of the review by Nakamura and Petcov in section 13 of Nakamura
et al. (2010). This review includes a large list of references to the
time-dependent formalism; we mention here the contributions of Kayser (1981),
Nauenberg (1999) and Cohen et al. (2009). We shall treat all the neutrinos
as stable particles.

We consider the evolution of the state $|\nu_\alpha\rangle$ in the frame in which the
detector which measures its flavour is at rest (the lab frame). As in the meson
case, the states with simple space-time evolution in a vacuum are the mass
eigenstates $|\nu_i\rangle$ ($i = 1, 2, 3$), a superposition of which is equal to $|\nu_\alpha\rangle$:

$$|\nu_\alpha\rangle = \sum_i U^*_{\alpha i}\,|\nu_i; p_i\rangle, \tag{21.111}$$

the complex conjugation arising from taking the dagger of the relation (21.106)
for the field operators. Here U stands for either the Dirac or the Majorana
matrix, and $p_i$ is the 4-momentum of $\nu_i$. Similarly,

$$|\nu_\beta\rangle = \sum_i U^*_{\beta i}\,|\nu_i; p_i\rangle. \tag{21.112}$$

We will consider highly relativistic neutrinos, as is the case for the experiments
under discussion. We will assume that there are no degeneracies among the
masses $m_i$. The states in the superpositions (21.111) and (21.112) will all
have, in general, different energies and momenta $E_i, p_i$. We shall also treat
the evolution as occurring in one spatial dimension, taking all the momenta
to lie in the direction from the source to the detector. Note that the fractional
deviation of $E_i$ and $p_i$ from the massless case $E = p$ is of order $m_i^2/E^2$, which
will be extremely small, of order one part in $10^{16}$, say.

Suppose now that the neutrinos of flavour $\nu_\alpha$ started in the state (21.111)
at time t = 0 in the detector frame are detected at time T after production,
having travelled a distance L. Then the amplitude for finding a neutrino of
flavour $\nu_\beta$ at (L, T) is

$$A(\nu_\alpha \to \nu_\beta; L, E) = \sum_i U^*_{\alpha i}\,e^{-i(E_iT - p_iL)}\,\langle \nu_\beta|\nu_i; p_i\rangle = \sum_i U^*_{\alpha i}U_{\beta i}\,e^{-i(E_iT - p_iL)}. \tag{21.113}$$

We make two immediate comments on (21.113). First, the Majorana phases
in (21.107) cancel in $A(\nu_\alpha \to \nu_\beta; L, E)$, since the same phase appears
in $U_{\alpha i}$ and $U_{\beta i}$. We conclude that oscillation experiments cannot distinguish
Majorana from Dirac neutrinos. Second, if the neutrinos were massless, the
phase factors in (21.113) would all be unity, and then $A(\nu_\alpha \to \nu_\beta; L, E) = \delta_{\alpha\beta}$,
from the unitarity of the matrix U, so there would be no flavour change.

² We shall not indicate the chirality explicitly from now on, it being assumed that we are
referring to the L (R) component for neutrinos (antineutrinos).

Flavour oscillations come about via the interference in $|A(\nu_\alpha \to \nu_\beta; L, E)|^2$
between phase factors that are slightly different from one another, because of
the different masses. A typical interference phase is then $\delta\phi_{ij} = (E_i - E_j)T -
(p_i - p_j)L$. Following the review by Nakamura and Petcov in Nakamura et al.
(2010), we note that

$$\frac{m_i^2 - m_j^2}{p_i + p_j}\,L = \frac{(E_i^2 - p_i^2) - (E_j^2 - p_j^2)}{p_i + p_j}\,L = (E_i - E_j)\,\frac{E_i + E_j}{p_i + p_j}\,L - (p_i - p_j)L \tag{21.114}$$

so that

$$\delta\phi_{ij} = (E_i - E_j)\left[T - \frac{E_i + E_j}{p_i + p_j}\,L\right] + \frac{m_i^2 - m_j^2}{p_i + p_j}\,L. \tag{21.115}$$

Bearing in mind that the energies differ from the momenta by terms of order
$m^2/E^2$, we see that the first term in (21.115) can be dropped, and the
interference phase is, to a very good approximation,

$$\delta\phi_{ij} \approx \frac{m_i^2 - m_j^2}{2E}\,L \equiv \frac{\Delta m^2_{ij}}{2E}\,L, \tag{21.116}$$

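where E is the average energy, or momentum, of the neutrinos. We therefore
obtain the probability

$$P(\nu_\alpha \to \nu_\beta; L, E) = \sum_i |U_{\alpha i}|^2|U_{\beta i}|^2 + 2\sum_{i>j} |U_{\beta i}U^*_{\alpha i}U_{\alpha j}U^*_{\beta j}|\,\cos\left(\frac{\Delta m^2_{ij}}{2E}\,L - \phi_{\beta\alpha;ij}\right) \tag{21.117}$$

where

$$\phi_{\beta\alpha;ij} = {\rm Arg}\,(U_{\beta i}U^*_{\alpha i}U_{\alpha j}U^*_{\beta j}). \tag{21.118}$$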


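A more useful expression can be obtained by using the unitarity of U (problem
21.6):

$$P(\nu_\alpha \to \nu_\beta; L, E) = \delta_{\alpha\beta} - 4\sum_{i>j} {\rm Re}\,(U_{\beta i}U^*_{\alpha i}U_{\alpha j}U^*_{\beta j})\,\sin^2\frac{\Delta m^2_{ij}L}{4E} + 2\sum_{i>j} {\rm Im}\,(U_{\beta i}U^*_{\alpha i}U_{\alpha j}U^*_{\beta j})\,\sin\frac{\Delta m^2_{ij}L}{2E}. \tag{21.119}$$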
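A direct numerical check of (21.119) is straightforward. The sketch below builds a mixing matrix in the standard CKM-type parametrization (our assumption for the explicit form; the angles and phase are illustrative, not fitted values), evaluates the probability as the squared modulus of the amplitude (21.113) with the phases (21.116), and compares it with the unitarity-reduced form (21.119). The squared masses use the mass-squared differences quoted in the text, with $m_1^2$ set to zero since only differences matter. All function names are ours.

```python
import cmath
import math

HBARC_EV_KM = 1.973269804e-10            # hbar*c in eV km
M2 = [0.0, 7.6e-5, 2.4e-3]               # m_i^2 in eV^2 (m_1^2 = 0 by convention)

def pmns(th12, th13, th23, delta=0.0):
    """CKM-type parametrization of the 3x3 mixing matrix (rows e, mu, tau)."""
    s12, c12 = math.sin(th12), math.cos(th12)
    s13, c13 = math.sin(th13), math.cos(th13)
    s23, c23 = math.sin(th23), math.cos(th23)
    ep, em = cmath.exp(1j * delta), cmath.exp(-1j * delta)
    rows = [
        [c12 * c13, s12 * c13, s13 * em],
        [-s12 * c23 - c12 * s23 * s13 * ep, c12 * c23 - s12 * s23 * s13 * ep, s23 * c13],
        [s12 * s23 - c12 * c23 * s13 * ep, -c12 * s23 - s12 * c23 * s13 * ep, c23 * c13],
    ]
    return [[complex(x) for x in row] for row in rows]

def phase(dm2_ev2, L_km, E_gev):
    """Dimensionless phase dm^2 L / (2E)."""
    return dm2_ev2 * L_km / (2.0 * E_gev * 1e9 * HBARC_EV_KM)

def prob_amplitude(U, a, b, L_km, E_gev, m2=M2):
    """|sum_i U*_{a i} U_{b i} exp(-i m_i^2 L / 2E)|^2, cf. (21.113) and (21.116)."""
    amp = sum(U[a][i].conjugate() * U[b][i] * cmath.exp(-1j * phase(m2[i], L_km, E_gev))
              for i in range(3))
    return abs(amp) ** 2

def prob_formula(U, a, b, L_km, E_gev, m2=M2):
    """Right-hand side of (21.119)."""
    p = 1.0 if a == b else 0.0
    for i in range(3):
        for j in range(i):
            w = U[b][i] * U[a][i].conjugate() * U[a][j] * U[b][j].conjugate()
            phi = phase(m2[i] - m2[j], L_km, E_gev)
            p += -4.0 * w.real * math.sin(phi / 2) ** 2 + 2.0 * w.imag * math.sin(phi)
    return p

U = pmns(0.59, 0.16, math.pi / 4, delta=1.0)   # illustrative angles and phase
for a, name in enumerate(("e", "mu", "tau")):
    row = [prob_amplitude(U, a, b, 295.0, 0.6) for b in range(3)]
    print(name, [round(p, 4) for p in row], "sum =", round(sum(row), 12))
```

Each row of probabilities sums to one, as unitarity requires, and setting all the squared masses equal reduces the probability to $\delta_{\alpha\beta}$, in line with the second comment on (21.113).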


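The expression for $P(\bar\nu_\alpha \to \bar\nu_\beta; L, E)$ is the same, except for a change in sign
of the last term:

$$P(\bar\nu_\alpha \to \bar\nu_\beta; L, E) = \delta_{\alpha\beta} - 4\sum_{i>j} {\rm Re}\,(U_{\beta i}U^*_{\alpha i}U_{\alpha j}U^*_{\beta j})\,\sin^2\frac{\Delta m^2_{ij}L}{4E} - 2\sum_{i>j} {\rm Im}\,(U_{\beta i}U^*_{\alpha i}U_{\alpha j}U^*_{\beta j})\,\sin\frac{\Delta m^2_{ij}L}{2E}. \tag{21.120}$$

It follows from (21.119) and (21.120) that $P(\nu_\alpha \to \nu_\beta; L, E) = P(\bar\nu_\beta \to \bar\nu_\alpha;
L, E)$, a consequence of CPT invariance. CP alone requires $P(\nu_\alpha \to \nu_\beta; L, E)
= P(\bar\nu_\alpha \to \bar\nu_\beta; L, E)$. A measure of CP violation is provided by

$$A^{(\beta\alpha)}_{\rm CP} = P(\nu_\alpha \to \nu_\beta; L, E) - P(\bar\nu_\alpha \to \bar\nu_\beta; L, E) = 4\sum_{i>j} {\rm Im}\,(U_{\beta i}U^*_{\alpha i}U_{\alpha j}U^*_{\beta j})\,\sin\frac{\Delta m^2_{ij}L}{2E}. \tag{21.121}$$

The reader will recognize the Jarlskog (1985) invariants in (21.121). In this
$3 \times 3$ mixing situation, which is exactly analogous to quark mixing, all these
invariants are equal up to a sign, and (21.121) becomes (Krastev and Petcov
1988)

$$A^{(\mu e)}_{\rm CP} = -A^{(\tau e)}_{\rm CP} = A^{(\tau\mu)}_{\rm CP} = 4J_{\rm CP}\left[\sin\left(\frac{\Delta m^2_{32}L}{2E}\right) + \sin\left(\frac{\Delta m^2_{21}L}{2E}\right) + \sin\left(\frac{\Delta m^2_{13}L}{2E}\right)\right] \tag{21.122}$$

where

$$J_{\rm CP} = {\rm Im}\,(U_{\mu 3}U^*_{e3}U_{e2}U^*_{\mu 2}). \tag{21.123}$$

If any one mass-squared difference is zero, say $\Delta m^2_{21}$, then $\Delta m^2_{32} = -\Delta m^2_{13}$
and the right-hand side of (21.122) vanishes: we need all three mass-squared
differences to be non-zero, in order to get CP violation.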
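Both ingredients of the CP asymmetry can be examined numerically. The sketch below, under the assumption of the standard CKM-type parametrization (illustrative angles, names ours), evaluates the invariant ${\rm Im}(U_{\mu 3}U^*_{e3}U_{e2}U^*_{\mu 2})$ from the matrix elements, compares it with the familiar closed form $c_{12}s_{12}c_{23}s_{23}c_{13}^2s_{13}\sin\delta$, and shows that the sum of three sines vanishes when one mass-squared difference is set to zero.

```python
import cmath
import math

def jarlskog(th12, th13, th23, delta):
    """Im(U_mu3 U*_e3 U_e2 U*_mu2) in the CKM-type parametrization."""
    s12, c12 = math.sin(th12), math.cos(th12)
    s13, c13 = math.sin(th13), math.cos(th13)
    s23, c23 = math.sin(th23), math.cos(th23)
    u_e2 = s12 * c13
    u_e3 = s13 * cmath.exp(-1j * delta)
    u_mu2 = c12 * c23 - s12 * s23 * s13 * cmath.exp(1j * delta)
    u_mu3 = s23 * c13
    return (u_mu3 * u_e3.conjugate() * u_e2 * u_mu2.conjugate()).imag

def jarlskog_closed_form(th12, th13, th23, delta):
    """c12 s12 c23 s23 c13^2 s13 sin(delta)."""
    return (math.cos(th12) * math.sin(th12) * math.cos(th23) * math.sin(th23)
            * math.cos(th13) ** 2 * math.sin(th13) * math.sin(delta))

def sine_sum(dm21, dm32, x):
    """sin(dm32 x) + sin(dm21 x) + sin(dm13 x), with x = L/2E and dm13 = -(dm21 + dm32)."""
    dm13 = -(dm21 + dm32)
    return math.sin(dm32 * x) + math.sin(dm21 * x) + math.sin(dm13 * x)

angles = (0.59, 0.16, math.pi / 4, 1.0)     # illustrative values (assumption)
print("J from elements :", jarlskog(*angles))
print("J closed form   :", jarlskog_closed_form(*angles))
print("sine sum, generic splittings :", sine_sum(7.6e-5, 2.3e-3, 1500.0))
print("sine sum, dm21 set to zero   :", sine_sum(0.0, 2.4e-3, 1500.0))
```

The invariant also vanishes if $\delta = 0$ or if any mixing angle is zero, so CP violation in oscillations needs three non-degenerate masses, three non-trivial angles, and a non-zero phase.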

In proceeding to discuss the experimental situation, it will be useful to
define an ‘oscillation length’ $\Lambda_{ij}(E)$ given by

$$\Lambda_{ij}(E) = 2E/\Delta m^2_{ij} \approx 0.4\,\frac{(E/{\rm GeV})}{(\Delta m^2_{ij}/{\rm eV}^2)}\ {\rm km}. \tag{21.124}$$
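The numerical coefficient in (21.124) comes from restoring factors of $\hbar c$ in the natural-units expression $2E/\Delta m^2$. A minimal sketch of the conversion follows, using the standard value $\hbar c \approx 197.327$ MeV fm; the function name is ours.

```python
HBARC_EV_M = 1.973269804e-7   # hbar*c in eV m

def oscillation_length_km(E_gev, dm2_ev2):
    """Oscillation length 2E/dm^2 of (21.124), converted from natural units to km."""
    length_m = 2.0 * (E_gev * 1e9) * HBARC_EV_M / dm2_ev2
    return length_m / 1e3

# Coefficient for E = 1 GeV and dm^2 = 1 eV^2.
print(oscillation_length_km(1.0, 1.0))

# Examples with the mass-squared differences quoted in the text.
print(oscillation_length_km(3e-3, 7.6e-5))   # 3 MeV reactor antineutrino, smaller splitting
print(oscillation_length_km(1.0, 2.4e-3))    # 1 GeV accelerator neutrino, larger splitting
```

With energies in MeV the same coefficient gives the length in metres rather than kilometres.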


In practice, the three-state mixing formalism can often be simplified, making
use of what is now known about the neutrino mass spectrum. One squared
mass difference is considerably smaller than the other:

$$|\Delta m^2_{21}| \approx 7.6 \times 10^{-5}\ {\rm eV}^2, \quad |\Delta m^2_{31}| \approx 2.4 \times 10^{-3}\ {\rm eV}^2. \tag{21.125}$$

Suppose now that $L/|\Lambda_{31}(E)| \sim 1$, while $L/|\Lambda_{21}(E)| \ll 1$. Then expression
(21.119) reduces to (problem 21.7)

$$P(\nu_\alpha \to \nu_\beta; L, E) \approx \delta_{\alpha\beta} - 4|U_{\alpha 3}|^2\left[\delta_{\alpha\beta} - |U_{\beta 3}|^2\right]\sin^2\frac{\Delta m^2_{31}L}{4E} = P(\bar\nu_\alpha \to \bar\nu_\beta; L, E). \tag{21.126}$$

In particular,

$$P(\bar\nu_e \to \bar\nu_e; L, E) = 1 - 4|U_{e3}|^2(1 - |U_{e3}|^2)\sin^2\frac{\Delta m^2_{31}L}{4E}, \tag{21.127}$$

which can describe the survival probability of reactor $\bar\nu_e$s, for example.

Adopting a parametrization of the form (20.166), with rows labelled by
$e$, $\mu$ and $\tau$, and columns by 1, 2, and 3, $|U_{e3}|^2$ is $\sin^2\theta_{13}$, which is found
experimentally to be small (see the following section). It is often a good
approximation to set $|U_{e3}|$ to zero, in which case $|U_{\mu 3}|^2 = \sin^2\theta_{23}$. Then
(21.126) gives the $\nu_\mu$ survival probability

$$P(\nu_\mu \to \nu_\mu; L, E) = P(\bar\nu_\mu \to \bar\nu_\mu; L, E) \approx 1 - \sin^2 2\theta_{23}\,\sin^2(L/2\Lambda_{31}(E)) \tag{21.128}$$

and the flavour-change probability

$$P(\nu_\mu \to \nu_\tau; L, E) = P(\bar\nu_\mu \to \bar\nu_\tau; L, E) \approx \sin^2 2\theta_{23}\,\sin^2(L/2\Lambda_{31}(E)). \tag{21.129}$$

In this approximation, $P(\nu_\mu \to \nu_e) = P(\bar\nu_\mu \to \bar\nu_e) = 0$. Formulae (21.128)
and (21.129) can be used to describe the dominant atmospheric $\nu_\mu$ and $\bar\nu_\mu$
oscillations (see the following section), and the parameters $\theta_{23}$ and $\Delta m^2_{31}$ (or
$\Delta m^2_{32}$) are referred to as the atmospheric mixing angle and mass squared
difference. The smaller mass squared difference $\Delta m^2_{21}$, and the angle $\theta_{12}$, are
associated with solar $\nu_e$ oscillations.

The formulae (21.128) and (21.129) are, in fact, exactly what a simple
2-state mixing model would give. Suppose that the effective mixing matrix
for the 2-state system has the form (see problem 1.6)

$$\begin{pmatrix} -a\cos 2\theta & a\sin 2\theta \\ a\sin 2\theta & a\cos 2\theta \end{pmatrix} \tag{21.130}$$

where rows are labelled by $e$, $\mu$ and columns by 1, 3; then the survival
probability is just

$$1 - \sin^2 2\theta\,\sin^2(La), \tag{21.131}$$

where we have taken $L \approx T$ as before. We can therefore identify the mixing
parameter as

$$a = \frac{1}{2\Lambda_{31}(E)} = \frac{\Delta m^2_{31}}{4E}. \tag{21.132}$$

Note that the energies are here measured relative to a common average energy;
if $|\ell\rangle$ is the lighter eigenstate and $|h\rangle$ the heavier, then

$$|\nu_e\rangle = \cos\theta\,|\ell\rangle + \sin\theta\,|h\rangle, \qquad |\nu_\mu\rangle = -\sin\theta\,|\ell\rangle + \cos\theta\,|h\rangle. \tag{21.133}$$
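The 2-state survival probability (21.131) can be verified by evolving a flavour state with the matrix (21.130) directly. In the sketch below the evolution operator $\exp(-iHL)$ is built from its Taylor series, which is adequate for the moderate values of $aL$ used here; the values of $\theta$, $a$ and $L$ are arbitrary illustrative numbers, and the function name is ours.

```python
import math

def evolve(H, L, terms=60):
    """exp(-i H L) for a 2x2 matrix H, by Taylor series (fine for |H| L of order a few)."""
    U = [[1 + 0j, 0j], [0j, 1 + 0j]]
    term = [[1 + 0j, 0j], [0j, 1 + 0j]]
    for n in range(1, terms):
        # term_n = term_{n-1} (-i L H) / n
        term = [[sum(term[r][k] * (-1j * L) * H[k][c] for k in range(2)) / n
                 for c in range(2)] for r in range(2)]
        U = [[U[r][c] + term[r][c] for c in range(2)] for r in range(2)]
    return U

theta, a, L = 0.6, 0.5, 2.6          # illustrative values, arbitrary units with a*L = 1.3
H = [[-a * math.cos(2 * theta), a * math.sin(2 * theta)],
     [a * math.sin(2 * theta), a * math.cos(2 * theta)]]

U = evolve(H, L)
p_numeric = abs(U[0][0]) ** 2                                     # survival of the first flavour
p_formula = 1.0 - math.sin(2 * theta) ** 2 * math.sin(a * L) ** 2  # (21.131)
print(p_numeric, p_formula)

# The lighter eigenstate (cos(theta), -sin(theta)) in the flavour basis has eigenvalue -a.
light = [math.cos(theta), -math.sin(theta)]
H_light = [sum(H[r][c] * light[c] for c in range(2)) for r in range(2)]
print(H_light, [-a * x for x in light])
```

The eigenvalues of the matrix are $\pm a$, so the relative phase accumulated over a distance $L$ is $2aL$, which is the origin of the $\sin^2(La)$ factor.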
21.4.3 Neutrino oscillations: experimental results


Historically, the search for neutrino oscillations began when experiments by
Davis et al. (1968) detected solar neutrinos (from $^8$B decays) at a rate approximately
one third of that predicted by the solar model calculations of Bahcall
et al. (1968). Pontecorvo (1946) had proposed the experiment, in which the
neutrinos are detected by the inverse $\beta$-decay process $\nu_e + {}^{37}{\rm Cl} \to e^- + {}^{37}{\rm Ar}$.
The Davis experiment used 520 metric tons of liquid tetrachloroethylene
(C$_2$Cl$_4$), buried 4850 feet underground in the Homestake gold mine, in South
Dakota. Davis’ findings provided the impetus to study solar neutrinos using
Kamiokande, a 3000 ton imaging water Cerenkov detector situated about
one kilometre underground in the Kamioka mine in Japan. Indeed, $^8$B solar
neutrinos were observed, and at a rate consistent with that of the Davis experiment
(Hirata et al. 1989). Later results from the Homestake mine (Cleveland
et al. 1998) reported a solar neutrino detection rate almost exactly one third
of the updated calculations of Bahcall et al. (2001).

In a separate development, Kamiokande also reported (Hirata et al. 1988)
an anomaly in the atmospheric neutrino flux. Atmospheric neutrinos are
produced as decay products in hadronic showers which result from collisions
of cosmic rays with nuclei in the upper atmosphere of the Earth. Production
of electron and muon neutrinos is dominated by the decay chain $\pi^+ \to \mu^+ +
\nu_\mu$, $\mu^+ \to e^+ + \bar\nu_\mu + \nu_e$ (and its charge-conjugate), which gives an expected
value of about 2 for the ratio of ($\nu_\mu + \bar\nu_\mu$) flux to ($\nu_e + \bar\nu_e$) flux.³ While the
number of electron-like events was in good agreement with the Monte Carlo
calculations based on atmospheric neutrino interactions in the detector, the
number of muon-like events was about one half of the expected number, at
the $4\sigma$ level.

This muon-like defect (and the lack of an electron-like defect) was later
confirmed at the $9\sigma$ level by Super-Kamiokande (Fukuda et al. 1998). In this
experiment, a marked dependence was observed on the zenith angle of the
muon neutrinos. This angle is simply related to the distance travelled by the
neutrinos from their point of production, which varies from about 20 km (from
above the detector) to over 10,000 km (from below the detector). The
Super-Kamiokande data were the first compelling evidence for neutrino oscillations.
Interpreting their data in terms of a simple 2-state $\nu_\mu \leftrightarrow \nu_\tau$ model, as in
(21.129), Fukuda et al. (1998) reported the values $\sin^2 2\theta_{23} > 0.82$, and
$5 \times 10^{-4} < \Delta m^2_{31} < 6 \times 10^{-3}$ eV$^2$ at 90% CL.

We will postpone further discussion of the solar neutrino deficit for the 
moment, since it is complicated by interactions of the neutrinos with the 
Sun's matter (see the following subsection). We proceed to describe some of 
the main results which have come from the analysis of data from neutrinos 
produced in terrestrial accelerators and reactors. 


³ The detector could not measure the charge of the final state leptons, and therefore $\nu$
and $\bar\nu$ events could not be discriminated.

We begin with the CHOOZ experiment, which was the first experiment to
limit the value of $\theta_{13}$ (Apollonio et al. 1999, 2003). CHOOZ is the name of
a nuclear power station situated near the French village of the same name.
The experiment was designed to detect reactor $\bar\nu_e$s via the inverse $\beta$-decay
reaction $\bar\nu_e + p \to e^+ + n$. The signature was a delayed coincidence between
the prompt $e^+$ signal, and the signal from the neutron capture. The detector
was located in an underground laboratory about 1 km from the neutrino
source. It consisted of a central 5-ton target filled with 0.09% Gd-doped
liquid scintillator; Gd-doping was chosen to maximize the capture of the neutrons.
The neutrino energy E was a few MeV, and L was 1 km. For these
values $2\Lambda_{21}(E)$ is greater than about 10 km, while $2\Lambda_{31}(E)$ is about 0.3 km.
The neglect of $\sin^2(L/2\Lambda_{21}(E))$ is justified, and formula (21.127) can be used
for the $\bar\nu_e$ survival probability. The experiment found no evidence for $\bar\nu_e$
disappearance, and reported the 90% CL upper limit of $\sin^2 2\theta_{13} < 0.19$, for
$|\Delta m^2_{31}| = 2 \times 10^{-3}$ eV$^2$. We shall for the moment set $\theta_{13}$ to zero, and return
to discuss its value at the end of the chapter.

The mass squared range $\Delta m^2 \gtrsim 2 \times 10^{-3}$ eV$^2$ can be explored by accelerator-based
long-baseline experiments, with typically $E \sim 1$ GeV and $L \sim$ several
hundred kilometres. The K2K (KEK-to-Kamioka) experiment was the
first accelerator-based experiment with a neutrino path length extending hundreds
of kilometres. A horn-focused wide-band $\nu_\mu$ beam with mean energy
1.3 GeV and path length 250 km was produced by 12 GeV protons from
the KEK-PS and directed to the Super-Kamiokande detector. In this case,
$L/2\Lambda_{21}(E) \sim 10^{-2}$, which may be neglected. Then formulae (21.128) and
(21.129) may be used, in the approximation $U_{e3} \approx 0$. The K2K data showed
(Ahn et al. 2006) that $\sin^2 2\theta_{23} \approx 1$ ($\theta_{23} \approx \pi/4$), and that $|\Delta m^2_{31}|$ had a value
consistent with (21.125).

The first evidence for the appearance of $\nu_e$ in a $\nu_\mu$ beam was obtained by
the T2K collaboration (Abe et al. 2011). The $\nu_\mu$ beam is produced using
the high intensity proton accelerator at J-PARC, located in Tokai, Japan.
The beam was directed $2.5^\circ$ off-axis to the Super-Kamiokande detector at
Kamioka, 295 km away. This configuration produces a narrow-band $\nu_\mu$ beam,
tuned at the first oscillation maximum $E_\nu = |\Delta m^2_{31}|L/2\pi \approx 0.6$ GeV, so
as to reduce background from higher energy neutrino interactions. In the
vacuum, the probability of the appearance of a $\nu_e$ in a $\nu_\mu$ beam is given (in
our customary effective 2-state mixing approximation) by (21.126) as

$$P(\nu_\mu \to \nu_e; L, E) = \sin^2\theta_{23}\,\sin^2 2\theta_{13}\,\sin^2\frac{\Delta m^2_{31}L}{4E}; \tag{21.134}$$

$P(\bar\nu_\mu \to \bar\nu_e; L, E)$ is given by the same expression. Taking $|\Delta m^2_{31}| = 2.4 \times
10^{-3}$ eV$^2$ and $\sin^2 2\theta_{23} = 1$, the number of expected $\nu_e$ events was $1.5 \pm
0.3$ (syst.) for $\sin^2 2\theta_{13} = 0$, and $5.5 \pm 1.0$ events if $\sin^2 2\theta_{13} = 0.1$. Six events
were observed which passed all the $\nu_e$ selection criteria. As we will see in the
following section, the value of $\sin^2 2\theta_{13} = 0.1$ is entirely consistent with direct
measurements of this quantity reported in 2012.
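The numbers quoted for T2K can be put into the formulae directly. The sketch below evaluates the energy of the first oscillation maximum, $|\Delta m^2|L/2\pi$, for the 295 km baseline, and the appearance probability (21.134) at that energy for the illustrative value $\sin^2 2\theta_{13} = 0.1$ with maximal $\theta_{23}$. Only $\hbar c$ is needed to restore units; the function names are ours.

```python
import math

HBARC_EV_KM = 1.973269804e-10   # hbar*c in eV km

def first_maximum_energy_gev(dm2_ev2, L_km):
    """Energy at which dm^2 L / 4E = pi/2, i.e. E = dm^2 L / (2 pi), in GeV."""
    return dm2_ev2 * L_km / (2.0 * math.pi * HBARC_EV_KM) / 1e9

def p_mu_to_e(sin2_th23, sin2_2th13, dm2_ev2, L_km, E_gev):
    """Two-state appearance probability of the form (21.134)."""
    arg = dm2_ev2 * L_km / (4.0 * E_gev * 1e9 * HBARC_EV_KM)
    return sin2_th23 * sin2_2th13 * math.sin(arg) ** 2

dm2 = 2.4e-3      # eV^2, larger mass-squared difference from the text
L = 295.0         # km, J-PARC to Super-Kamiokande

E_max = first_maximum_energy_gev(dm2, L)
print(f"first oscillation maximum at E ~ {E_max:.2f} GeV")
print("P(nu_mu -> nu_e) at the maximum:", p_mu_to_e(0.5, 0.1, dm2, L, E_max))
```

At the first maximum the oscillatory factor is unity, so the appearance probability is simply $\sin^2\theta_{23}\sin^2 2\theta_{13}$, a few per cent for the value of $\theta_{13}$ considered here.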
Another long baseline accelerator experiment is MINOS at Fermilab. Neutrinos
are produced by the Neutrinos at the Main Injector facility (NuMI),
using 120 GeV protons from the Fermilab main injector. The detector is a
5.4 kton iron-scintillator tracking calorimeter with a toroidal magnetic field,
situated underground in the Soudan mine, 735 km from Fermilab. The neutrino
energy spectrum from a wide-band beam is horn-focused to be enhanced
in the 1-5 GeV range. The current MINOS results yield $|\Delta m^2_{31}| =
(2.32^{+0.12}_{-0.08}) \times 10^{-3}$ eV$^2$, and $\sin^2 2\theta_{23} > 0.90$ at 90% CL (Adamson et al.
2011).

A second reactor experiment, KamLAND at Kamioka, was designed to
be sensitive to the smaller squared mass difference $\Delta m^2_{21}$, and thus to $\theta_{12}$.
The Kamioka Liquid scintillator AntiNeutrino Detector is at the site of the
former Kamiokande experiment. The detector is essentially one kiloton of highly
purified liquid scintillator surrounded by photomultiplier tubes. $\bar\nu_e$s are detected
as usual via the inverse $\beta$-decay reaction $\bar\nu_e + p \to e^+ + n$. KamLAND
is surrounded by 55 nuclear power units, each an isotropic $\bar\nu_e$ source. The
flux-weighted average path length is $L \approx 180$ km, and the energy E ranges
from about 2 MeV to about 8 MeV. For E = 3 MeV, $2\Lambda_{21}(E) \approx 30$ km, which
allows for more than one oscillation. In this case (21.119) reduces to

$$P(\bar\nu_e \to \bar\nu_e; L, E) = 1 - 4|U_{e1}|^2|U_{e2}|^2\,\sin^2(L/2\Lambda_{21}(E)) \tag{21.135}$$

assuming $|U_{e3}| \approx 0$. In a parametrization of the form (20.166), this becomes

$$P(\bar\nu_e \to \bar\nu_e; L, E) = 1 - \sin^2 2\theta_{12}\,\sin^2(L/2\Lambda_{21}(E)), \tag{21.136}$$

again a simple 2-state mixing result. Data shown in figure 21.11 (Abe et al.
2008) gives

$$|\Delta m^2_{21}| = 7.58^{+0.14}_{-0.13}\,({\rm stat})\,^{+0.15}_{-0.15}\,({\rm syst}) \times 10^{-5}\ {\rm eV}^2 \tag{21.137}$$

$$\tan^2\theta_{12} = 0.56^{+0.10}_{-0.07}\,({\rm stat})\,^{+0.10}_{-0.06}\,({\rm syst}). \tag{21.138}$$

The KamLAND data showed for the first time the periodic behaviour of the
$\bar\nu_e$ survival probability.
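To get a feel for the periodic behaviour, the 2-state survival probability (21.136) can be tabulated across the reactor energy range at the flux-weighted baseline of 180 km. The sketch below uses $\tan^2\theta_{12} = 0.56$ from (21.138) and the rounded value $7.6 \times 10^{-5}$ eV$^2$ for the smaller mass-squared difference; it ignores the spread in baselines, the energy resolution and matter effects, so it is only a rough guide. Names are ours.

```python
import math

HBARC_EV_KM = 1.973269804e-10   # hbar*c in eV km

def osc_phase(dm2_ev2, L_km, E_mev):
    """Phase dm^2 L / 4E appearing in sin^2 of the survival probability."""
    return dm2_ev2 * L_km / (4.0 * E_mev * 1e6 * HBARC_EV_KM)

def survival(sin2_2theta, dm2_ev2, L_km, E_mev):
    """Two-state survival probability of the form (21.136)."""
    return 1.0 - sin2_2theta * math.sin(osc_phase(dm2_ev2, L_km, E_mev)) ** 2

tan2_th12 = 0.56
sin2_2th12 = 4.0 * tan2_th12 / (1.0 + tan2_th12) ** 2   # sin^2(2 theta) from tan^2(theta)
dm21 = 7.6e-5     # eV^2
L0 = 180.0        # km

print(f"sin^2(2 theta_12) = {sin2_2th12:.3f}")
for E in (2.0, 3.0, 4.0, 6.0, 8.0):
    periods = osc_phase(dm21, L0, E) / math.pi   # number of full periods of sin^2
    print(f"E = {E:3.1f} MeV  L0/E = {L0 / E:5.1f} km/MeV  "
          f"periods = {periods:4.2f}  P = {survival(sin2_2th12, dm21, L0, E):.3f}")
```

Across the 2 to 8 MeV range the phase sweeps through more than one full period, which is why a plot against $L_0/E$ shows an oscillation rather than a simple deficit.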

We now return to the solar neutrino problem, taking up the story after
Davis’ results. Some doubts remained as to whether the solar calculations
could be absolutely relied upon, for example because of the extreme sensitivity
to the core temperature ($\propto T^{18}$). One particular class of $\nu_e$ could, however,
be reliably calculated, namely those associated with the initial reaction $pp \to
{}^2{\rm H} + e^+ + \nu_e$ of the pp cycle. Whereas the Davis experiments allowed detection
of the higher energy $\nu_e$s (threshold 814 keV) from the $^8$B and $^7$Be stages of
the cycle, the energy of the $\nu_e$s from the pp stage cuts off at around 400
keV. Detectors using the reaction $\nu_e + {}^{71}{\rm Ga} \to e^- + {}^{71}{\rm Ge}$, which has a 233
keV energy threshold, were built (GALLEX, GNO and SAGE); their results
(Altman et al. 2005, Abdurashitov et al. 2009) are in agreement, and again
much smaller than the (updated) Bahcall et al. (2005) prediction.

FIGURE 21.11

Ratio of the background and geo-neutrino subtracted $\bar\nu_e$ spectrum to the
expectation for no-oscillation, as a function of $L_0/E$, where $L_0 = 180$ km. Figure
reprinted with permission from S Abe et al. (KamLAND Collaboration) Phys.
Rev. Lett. 100 221803 (2008). Copyright 2008 by the American Physical Society.


In 1999, the Sudbury Neutrino Observatory (SNO) in Canada began observation.
This experiment used 1 kiloton of ultra-pure heavy water (D$_2$O).
It measured $^8$B solar $\nu_e$s via both the CC reaction $\nu_e + d \to e^- + p + p$, and
the NC reaction $\nu + d \to \nu + p + n$, as well as elastic $\nu e^-$ scattering. The CC
reaction is sensitive only to $\nu_e$, while the NC reaction is sensitive to all active
neutrinos, as is $\nu e^-$ scattering. If the solar neutrino deficit were caused by
neutrino oscillations, the solar neutrino fluxes measured by the CC and NC
reactions would be significantly different. SNO found that, while the total
neutrino flux was consistent with solar model expectations, the ratio of the
$\nu_e$ flux to the total neutrino flux was about 1/3 (Ahmad et al. 2001, 2002).
This number can be understood in terms of the effect of dense matter on the
propagation of the $\nu_e$s, as we now discuss.


21.4.4 Matter effects in neutrino oscillations 


We have assumed in the foregoing that neutrinos propagate in vacuum between
the source and the detector. Since neutrinos interact only weakly, it
might seem that this is always an excellent approximation. But in the same
way that light travelling through a transparent medium can have its refractive
index changed, so can a neutrino. In particular, the refractive index can be
different for $\nu_e$ and $\nu_\mu$. The difference in refractive indices is determined by
the difference in the real parts of the forward $\nu_e e^-$ and $\nu_\mu e^-$ elastic scattering
amplitudes (Wolfenstein 1978). The essential point is that the scattering
can be coherent, with the spins and momenta of the particles remaining unchanged.
This means that the effect is going to be proportional to the density
of electrons in the matter traversed, $N_e$. The scattering amplitude, in turn, is
proportional to $G_{\rm F}$, so that a figure of merit for the effect is given by the product
$G_{\rm F}N_e$. This has the dimensions of an energy, and can be interpreted as an
addition to the effective 2-state mixing matrix of (21.130). Detailed analysis,
which we omit, shows that the correct addition is actually $\sqrt{2}\,G_{\rm F}N_e$, so that
(21.130) is modified to


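\begin{pmatrix}
  -\frac{\Delta m^2}{4E}\cos 2\theta + \sqrt{2}\, G_F N_e & \frac{\Delta m^2}{4E}\sin 2\theta \\
  \frac{\Delta m^2}{4E}\sin 2\theta & \frac{\Delta m^2}{4E}\cos 2\theta
\end{pmatrix}                                                      (21.139)

where now $\Delta m^2 = m_2^2 - m_1^2$, and $\theta = \theta_{12}$. Two-state
mixing now gives (problem 21.8) a new mixing angle $\theta_m$ such that

\tan 2\theta_m = \frac{\tan 2\theta}{1 - N_e/N_{\rm res}},
\qquad N_{\rm res} = \frac{\Delta m^2 \cos 2\theta}{2\sqrt{2}\, G_F E},   (21.140)

and the mass eigenstates $|1\rangle_m$, $|2\rangle_m$ correspond to the
eigenvalue difference

m_2 - m_1 = |\Delta m^2_{21}/2E| \left[\cos^2 2\theta\,(1 - N_e/N_{\rm res})^2 + \sin^2 2\theta\right]^{1/2}.   (21.141)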


We see that although the new term is certainly very small, being proportional
to $G_F$, nevertheless since $\Delta m^2$ is very small also, a significant
effect can occur. In particular, if it should happen that
$N_e \approx N_{\rm res}$ for some $(\theta, E)$, then $\theta_m$ will be
‘maximal’ ($\theta_m = \pi/4$), irrespective of the value of the original
$\theta$. This is called ‘resonant mixing’ (Mikheev and Smirnov 1985, 1986).
It implies that the probability for a $\nu_e \to \nu_\mu$ flavour change could
be greatly enhanced over the vacuum value, which is proportional to
$\sin^2 2\theta_{12}$. A point to note, also, is that the corresponding
formulae for $\bar\nu_e$s are obtained by replacing $N_e$ by $-N_e$; then,
depending on the sign of $\Delta m^2 \cos 2\theta_{12}$, resonant mixing can
occur for one or the other of $\nu_e$ or $\bar\nu_e$ as they pass through
matter, but not both. Similar considerations apply to the propagation of
neutrinos through the earth, but we shall not pursue this here (see Nakamura
and Petcov in Nakamura et al. 2010).
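
The resonance behaviour of (21.140) and (21.141) can be explored numerically.
The following is a minimal sketch, not part of the original text: it works
with the dimensionless ratio x = N_e/N_res, so no unit conversions are needed.
The function names and the illustrative choice sin^2(theta) = 1/3 are
assumptions made for this example.

```python
import math

def matter_mixing_angle(theta, x):
    """Return theta_m in [0, pi/2] from tan(2 theta_m) = tan(2 theta) / (1 - x).

    theta is the vacuum mixing angle (radians); x = N_e / N_res (dimensionless).
    atan2 selects the branch 0 <= 2*theta_m <= pi, so theta_m passes
    continuously through pi/4 at the resonance x = 1.
    """
    return 0.5 * math.atan2(math.sin(2 * theta), math.cos(2 * theta) * (1 - x))

def splitting_factor(theta, x):
    """Square-root factor of (21.141): eigenvalue difference in units of |dm^2/2E|."""
    c2, s2 = math.cos(2 * theta), math.sin(2 * theta)
    return math.sqrt(c2**2 * (1 - x)**2 + s2**2)

theta = math.asin(math.sqrt(1 / 3))  # illustrative vacuum angle, sin^2(theta) = 1/3
for x in (0.0, 0.5, 1.0, 5.0, 100.0):
    angle_deg = math.degrees(matter_mixing_angle(theta, x))
    print(f"x = {x:6.1f}  theta_m = {angle_deg:6.2f} deg"
          f"  splitting = {splitting_factor(theta, x):.3f}")
```

At x = 0 the vacuum angle is recovered, at x = 1 the mixing is maximal and the
splitting takes its minimum value sin(2 theta), and for x much greater than 1
the angle approaches pi/2.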

In the case of solar neutrinos, the effect of the above modifications is quite
simple. For the highest energy neutrinos, $N_e \gg N_{\rm res}$ at the centre
of the sun, so that $\theta_m \approx \pi/2$ at production in the core, and the
$\nu_e$ is in the heavier mass state $|2\rangle_m$. On the way to the surface
of the Sun, $N_e$ will decrease, and a point will be reached when
$N_e = N_{\rm res}$. Here the mass difference (21.141) reaches its minimum, and
two limiting cases may be distinguished depending on the scale of the variation
in the electron density, which has been assumed constant in (21.139)-(21.141).
(i) If the density variation is slow enough that at least one oscillation
length fits into the resonant density region, then it can be shown that the
state stays with state $|2\rangle_m$ (‘adiabatic evolution’) until it reaches
the surface of the Sun, when $\theta_m \to \theta_{12}$. The probability that
the neutrino will survive to the earth is then (using (21.133))
$|\langle\nu_e|2\rangle_m|^2 = \sin^2\theta_{12}$, which has a value of about
1/3. In the alternative limit, (ii), in which the oscillation length in matter
is relatively large with respect to the scale of density variation, the state
may ‘jump’ to the other mass state $|1\rangle_m$ (‘extreme non-adiabatic
evolution’), and then $|\langle\nu_e|1\rangle_m|^2 = \cos^2\theta_{12}$. These
are clearly extreme cases, and numerical work is required in the general case.
However, the data from SNO and other water Cerenkov detectors are consistent
with the first (adiabatic) alternative, and with the value
$\sin^2\theta_{12} \approx 1/3$. Note that the solar data imply that
$(m_2^2 - m_1^2) \cos 2\theta_{12} > 0$.

By contrast, for the lowest energy neutrinos we can take
$\theta_m \approx \theta$, so that the neutrinos are produced in the state
$\cos\theta_{12}|1\rangle + \sin\theta_{12}|2\rangle$, and propagate as in a
vacuum, oscillating with maximum excursion $\sin^2 2\theta_{12}$. The detectors
average over many oscillations, giving a factor of 1/2, so that the survival
probability for the low energy $\nu_e$s is
$1 - \frac{1}{2}\sin^2 2\theta_{12} \approx 5/9$. The Gallium experiments are
sensitive to the lower energy neutrinos, and indeed record some 60-70% of the
expected flux.
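
The two limiting survival probabilities quoted above follow from simple
arithmetic once sin^2(theta_12) = 1/3 is assumed. A short sketch using exact
fractions is given below; the function name is hypothetical and the input
value is the approximate one used in the text, not a fitted result.

```python
from fractions import Fraction

def survival_probabilities(sin2_theta):
    """Limiting solar nu_e survival probabilities for a given sin^2(theta).

    high energy: adiabatic MSW limit, P = sin^2(theta)
    low energy:  averaged vacuum oscillations, P = 1 - (1/2) sin^2(2 theta)
    """
    sin2_2theta = 4 * sin2_theta * (1 - sin2_theta)  # sin^2(2 theta)
    p_high = sin2_theta
    p_low = 1 - sin2_2theta / 2
    return p_high, p_low

p_high, p_low = survival_probabilities(Fraction(1, 3))
print(p_high, p_low)  # prints: 1/3 5/9
```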

In summary, the solar neutrino data are consistent with the interpretation
in terms of neutrino oscillations, as modified by the
Mikheev-Smirnov-Wolfenstein (MSW) effect. A global solar + KamLAND analysis
yields best fit values (Aharmim et al. 2010)


\theta_{12} = 34.06^{+1.16}_{-0.84} degrees, \qquad \Delta m^2_{21} = 7.59^{+0.20}_{-0.21} \times 10^{-5} eV^2.   (21.142)


21.4.5 Further developments 


Despite the remarkable experimental progress in the studies of neutrino
oscillations over the last decade, there still remain some basic gaps in our
knowledge. Perhaps the most fundamental is the Dirac/Majorana nature of massive
neutrinos. The most feasible (but very difficult) test is neutrinoless double
$\beta$-decay ($0\nu\beta\beta$-decay), already touched on in section 20.3. As
noted there, the amplitude is proportional to an average Majorana mass
parameter $\langle m \rangle$. Experiments place a lower bound on the half-life
for the decay, which translates into an upper bound on $\langle m \rangle$. The
most stringent lower bounds on the half-lives have been obtained with decays of
$^{76}$Ge (Klapdor-Kleingrothaus et al. 2001), $^{130}$Te (Andreotti et al.
2011) and $^{100}$Mo (Arnold et al. 2006). Lower bounds on the half-lives range
from $10^{24}$ to $10^{25}$ years, with corresponding upper bounds on
$\langle m \rangle$ of the order of 0.5 eV. It should, however, be noted that
some participants of the Heidelberg-Moscow experiment claimed the observation
of $0\nu\beta\beta$ decay of $^{76}$Ge with a half-life of
$2.23^{+0.44}_{-0.31} \times 10^{25}$ years, from which they deduced
$\langle m \rangle = 0.32 \pm 0.03$ eV (Klapdor-Kleingrothaus et al. 2006). The
GERDA experiment (Ur et al. 2011) should be able to check this claim after
one year of running. Other experiments currently running, or planned, will
push the bound on half-lives up to $10^{26}$-$10^{27}$ years, and the upper
bound on $\langle m \rangle$ down to magnitudes of the order of a few times
$10^{-2}$ eV.

A second crucial question concerns the magnitude of CP-violation effects
in neutrino oscillations. We recall from (20.172) that this vanishes if
$\sin\theta_{13} = 0$. As we saw earlier, CHOOZ set a 90% CL limit
$\sin^2 2\theta_{13} < 0.17$. A non-zero value of $\sin^2 2\theta_{13}$ has now
been observed by two groups, both $\bar\nu_e$ disappearance experiments: the
Daya Bay collaboration (An et al. 2012) and the RENO collaboration (Ahn et al.
2012). Their reported results were


\sin^2 2\theta_{13} = 0.092 \pm 0.016 \pm 0.005   (Daya Bay)   (21.143)
\sin^2 2\theta_{13} = 0.113 \pm 0.013 \pm 0.019   (RENO),      (21.144)


in a 3-neutrino framework. For this value of $\sin\theta_{13}$, it should be
possible to detect a CP-violating difference in the probabilities for
$\nu_\mu \to \nu_e$ and $\bar\nu_\mu \to \bar\nu_e$, and it may be enough to
sustain leptogenesis models.

The value of $\sin\theta_{13}$ is also relevant to the determination of the
sign of $\Delta m^2_{31}$; we shall mention just one possibility. We have seen
that the MSW effect for solar neutrinos implies that $m_2 > m_1$ (using the
fact that $\cos 2\theta_{12} > 0$), but the mass spectrum (for 3-neutrino
mixing) could be ordered as $m_1 < m_2 < m_3$ (‘normal spectrum’) or as
$m_3 < m_1 < m_2$ (‘inverted spectrum’). We have ignored the terrestrial MSW
effect, but it can be significant in long-baseline accelerator-based
experiments, and could be exploited to determine the sign of $m_3 - m_1$. In
the vacuum, the probability of the appearance of a $\nu_e$ in a $\nu_\mu$ beam
is given by (21.134) (in our customary effective 2-state mixing
approximation). As in the solar case, these probabilities will be modified by
the MSW effect, which will enhance (suppress) the appearance probability for
neutrinos (antineutrinos) in the case of the normal spectrum, and vice versa
for the inverted spectrum. Clearly if $\theta_{13}$ were too small, the effect
would be very hard to see, but the value in (21.143) and (21.144) makes this a
realistic experiment; it formed part of the physics motivation for the NOvA
experiment at Fermilab (Ayres et al. 2005). NOvA is a long-baseline neutrino
oscillation experiment now under construction, which aims to detect the
appearance of $\nu_e$ and $\bar\nu_e$ in the NuMI muon neutrino beam. The beam
from Fermilab is directed 14 mrad off-axis to a detector 810 km away; the
neutrino energy is narrowly peaked around 2.2 GeV. NOvA will also have
sensitivity to leptonic CP-violation.


Problems 

21.1 Verify equation (21.34). 

21.2 Verify equations (21.46) and (21.47). 

21.3 Verify equations (21.48) and (21.49).

21.4 Verify equation (21.56).

21.5 Verify equations (21.89) and (21.90).

21.6 Verify equation (21.119).

21.7 Verify equation (21.126).


21.8 Verify equations (21.140) and (21.141). 
The Glashow-Salam-Weinberg Gauge
Theory of Electroweak Interactions


22.1 Difficulties with the current-current and ‘naive’ IVB
models


In chapter 20 we developed the ‘V-A current-current’ phenomenology of weak
interactions. We saw that this gives a remarkably accurate account of a wide
range of data, so much so, in fact, that one might well wonder why it should
not be regarded as a fully-fledged theory. One good reason for wanting to do
this would be in order to carry out calculations beyond the lowest order, which
is essentially all we have used it for so far (with the significant exceptions
of the GIM argument, and box diagrams in $M$-$\bar M$ mixing). Such
higher-order calculations are indeed required by the precision attained in
modern high energy
experiments. But the electroweak theory of Glashow, Salam and Weinberg, 
now recognized as one of the pillars of the Standard Model, was formulated 
long before such precision measurements existed, under the impetus of quite 
compelling theoretical arguments. These had to do, mainly, with certain in- 
principle difficulties associated with the current-current model, if viewed as a 
‘theory’. Since we now believe that the GSW theory is the correct description 
of electroweak interactions up to currently tested energies, further discussions 
of these old issues concerning the current-current model might seem irrele- 
vant. However, these difficulties do raise several important points of principle. 
An understanding of them provides valuable motivation for the GSW theory 
— and some idea of what is ‘at stake’ in regard to experiments relating to 
the Higgs sector, which has only recently begun to be explored (see section 
22.8.3). 

Before reviewing the difficulties, however, it is worth emphasizing once 
again a more positive motivation for a gauge theory of weak interactions 
(Glashow 1961). This is the remarkable ‘universality’ structure noted in chap- 
ter 20, not only as between different types of lepton, but also (within the con- 
text of CKM mixing) between the quarks and the leptons. This recalls very 
strongly the ‘universality’ property of QED, and the generalization of this 
property in the non-Abelian theories of chapter 13. A gauge theory would 
provide a natural framework for such universal couplings. 
FIGURE 22.1
Current-current amplitude for $\bar\nu_\mu + \mu^- \to \bar\nu_e + e^-$.


22.1.1 Violations of unitarity 


We have seen several examples, in chapter 20, in which cross sections were 
predicted to rise indefinitely as a function of the invariant variable s, which 
is the square of the total energy in the CM frame. We begin by showing why 
this is ultimately an unacceptable behaviour. 

Consider the process (figure 22.1)


\bar\nu_\mu + \mu^- \to \bar\nu_e + e^-   (22.1)


in the current-current model, regarding it as a fundamental interaction,
treated to lowest order in perturbation theory. A similar process was discussed
in chapter 20. Since the troubles we shall find occur at high energies, we can
simplify the expressions by neglecting the lepton masses without altering the
conclusions. In this limit the invariant amplitude is (problem 22.1), up to a
numerical factor,

\mathcal{M} = G_F E^2 (1 + \cos\theta)   (22.2)


where $E$ is the CM energy, and $\theta$ is the CM scattering angle of the
$e^-$ with respect to the direction of the incident $\mu^-$. This leads to the
following behaviour of the cross section (cf (20.83), remembering that
$s = 4E^2$):


\sigma \sim G_F^2 E^2.   (22.3)


The dependence on $E^2$ is a consequence of the fact that $G_F$ is not
dimensionless, having the dimensions of $[M]^{-2}$. Its value is (Nakamura et
al. 2010)

G_F = 1.16637(1) \times 10^{-5} GeV^{-2}.   (22.4)


The cross section has dimensions of $[L]^2 = [M]^{-2}$, but must involve
$G_F^2$ which has dimension $[M]^{-4}$. It must also be relativistically
invariant. At energies well above lepton masses, the only invariant quantity
available to restore the correct dimensions to $\sigma$ is $s$, the square of
the total CM energy, so that $\sigma \sim G_F^2 E^2$.

Consider now a partial wave analysis of this process. For spinless particles
the total cross section may be written as a sum of partial wave cross sections


\sigma = \frac{4\pi}{k^2} \sum_J (2J+1) |f_J|^2   (22.5)

where $f_J$ is the partial wave amplitude for angular momentum $J$ and $k$ is
the CM momentum. It is a consequence of unitarity, or flux conservation (see,
for example, Merzbacher 1998, chapter 13), that the partial wave amplitude may
be written in terms of a phase shift $\delta_J$:


f_J = e^{i\delta_J} \sin\delta_J   (22.6)


so that

|f_J| \leq 1.   (22.7)


Thus the cross section in each partial wave is bounded by

\sigma_J \leq 4\pi(2J+1)/k^2   (22.8)


which falls as the CM energy rises. By contrast, in (22.3) we have a cross
section that rises with CM energy:


\sigma \sim E^2.   (22.9)


Moreover, since the amplitude (equation (22.2)) only involves
$(\cos\theta)^0$ and $(\cos\theta)^1$ contributions, it is clear that this rise
in $\sigma$ is associated with only a few partial waves, and is not due to more
and more partial waves contributing to the sum in $\sigma$. Therefore, at some
energy $E$, the unitarity bound will be violated by this lowest-order (Born
approximation) expression for $\sigma$.

This is the essence of the ‘unitarity disease’ of the current-current model.
To fill in all the details, however, involves a careful treatment of the
appropriate partial wave analysis for the case when all particles carry spin.
We shall avoid those details. Instead we argue, again on dimensional grounds,
that the dimensionless partial wave amplitude $f_J$ (note the $1/k^2$ factor in
(22.5)) must be proportional to $G_F E^2$, which violates the bound (22.7) for
CM energies


E \geq G_F^{-1/2} \sim 300 GeV.   (22.10)
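
The numerical value of this ‘dangerous’ scale can be checked directly from the
value of the Fermi constant quoted in (22.4). The lines below are a minimal
sketch of that arithmetic; the function name is hypothetical, and the criterion
G_F E^2 = 1 is the order-of-magnitude estimate used in the text, not a precise
partial-wave bound.

```python
import math

G_F = 1.16637e-5  # Fermi constant in GeV^-2, as quoted in (22.4)

def unitarity_scale(g_fermi):
    """Energy in GeV at which the dimensionless product G_F * E^2 reaches 1."""
    return 1.0 / math.sqrt(g_fermi)

E_star = unitarity_scale(G_F)
print(f"G_F^(-1/2) = {E_star:.0f} GeV")
```

The result is a little under 300 GeV, consistent with the rounded estimate in
(22.10).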


At this point the reader may recall a very similar-sounding argument made 
in section 11.8, which led to the same estimate of the ‘dangerous’ energy scale 
(22.10). In that case, the discussion referred to a hypothetical ‘4-fermi’ inter- 
action without the V-A structure, and it was concerned with renormalization 
rather than unitarity. The gamma-matrix structure is irrelevant to these is- 
sues, which ultimately have to do with the dimensionality of the coupling 
constant, in both cases. In fact, as we shall see, unitarity and renormalizabil- 
ity are actually rather closely related. 

Faced with this unitarity difficulty, we appeal to the most successful theory
we have, and ask: what happens in QED? We consider an apparently quite
similar process, namely $e^+e^- \to \mu^+\mu^-$ in lowest order (figure 22.2).
In chapter 8 the total cross section for this process, neglecting lepton
masses, was found to be (see problem 8.18 and equation (9.87))


\sigma = 4\pi\alpha^2/3E^2   (22.11)

which obediently falls with energy as required by unitarity. In this case the
coupling constant $\alpha$, analogous to $G_F$, is dimensionless, so that a
factor $E^2$ is required in the denominator to give $\sigma \sim [L]^2$.

FIGURE 22.2
One-photon annihilation graph for $e^+e^- \to \mu^+\mu^-$.

If we accept this clue from QED, we are led to search for a theory of
weak interactions that involves a dimensionless coupling constant. Pressing
the analogy with QED further will help us to see how one might arise.
Fermi’s current-current model was, as we said, motivated by the vector
currents of QED. But in Fermi’s case the currents interact directly with each
other, whereas in QED they interact only indirectly via the mediation of the
electromagnetic field. More formally, the Fermi current-current interaction
has the ‘four point’ structure


G_F (\bar{\hat\psi}\hat\psi) \cdot (\bar{\hat\psi}\hat\psi)   (22.12)

while QED has the ‘three-point’ (Yukawa) structure

e \bar{\hat\psi}\hat\psi\hat{A}.   (22.13)


Dimensional analysis easily shows, once again, that $[G_F] = M^{-2}$ while
$[e] = M^0$. This strongly suggests that we should take Fermi's analogy
further, and look for a weak interaction analogue of (22.13), having the form


g \bar{\hat\psi}\hat\psi\hat{W}   (22.14)


where $\hat W$ is a bosonic field. Dimensional analysis shows, of course, that
$[g] = M^0$.

Since the weak currents are in fact vector-like, we must assume that the W
fields are also vectors (spin-1) so as to make (22.14) Lorentz invariant. And
because the weak interactions are plainly not long-range, like electromagnetic
ones, the mass of the W quanta cannot be zero. So we are led to postulate
the existence of a massive weak analogue of the photon, the ‘intermediate
vector boson’ (IVB), and to suppose that weak interactions are mediated by
the exchange of IVB's.

There is, of course, one further difference with electromagnetism, which
is that the currents in $\beta$-decay, for example, carry charge (e.g.
$\bar u_e \gamma^\mu (1 - \gamma_5) u_{\nu_e}$ creates negative charge or
destroys positive charge). The ‘companion’ hadronic current carries the
opposite charge (e.g. $\bar u_p \gamma_\mu (1 - r\gamma_5) u_n$ destroys
negative charge or creates positive charge), so as to make the total effective
interaction charge-conserving, as required. It follows that the W fields must
then be charged, so that expressions of the form (22.14) are neutral. Because
both charge-raising and charge-lowering currents exist, we need both $W^+$ and
$W^-$. The reaction (22.1), for example, is then conceived as proceeding via
the Feynman diagram shown in figure 22.3, quite analogous to figure 22.2.

FIGURE 22.3
One-$W^-$ annihilation graph for $\bar\nu_\mu + \mu^- \to \bar\nu_e + e^-$.

Because we also have weak neutral currents, we need a neutral vector
boson as well, $Z^0$. In addition to all these, there is the familiar massless
neutral vector boson, the photon. Despite the fact that they are not massless,
the $W^\pm$ and $Z^0$ can be understood as gauge quanta, thanks to the
symmetry-breaking mechanism explained in section 19.6. For the moment, however,
we are going to follow a more scenic route, and accept (as Glashow did in 1961)
that we are dealing with ordinary ‘unsophisticated’ massive vector particles,
charged and uncharged.

We now investigate whether the IVB model can do any better with unitar- 
ity than the current-current model. The analysis will bear a close similarity 
to the discussion of the renormalizability of the model in section 19.1, and we 
shall take up that issue again in section 22.1.2. 

The unitarity-violating processes turn out to be those involving external
W particles. Consider, for example, the process


\nu_\mu + \bar\nu_\mu \to W^+ + W^-   (22.15)


proceeding via the graph shown in figure 22.4. The fact that this is
experimentally a somewhat esoteric reaction is irrelevant for the subsequent
argument: the proposed theory, represented by the IVB modification of the
four-fermion model, will necessarily generate the amplitude shown in figure
22.4, and since this amplitude violates unitarity, the theory is unacceptable.
The amplitude for this process is proportional to


\mathcal{M}_{\lambda_1\lambda_2} = g^2 \epsilon^{-*}_\mu(k_2,\lambda_2)\, \epsilon^{+*}_\nu(k_1,\lambda_1)\, \bar v(p_2) \gamma^\mu (1 - \gamma_5)
    \times \frac{(\slashed{p}_1 - \slashed{k}_1 + m_\mu)}{(p_1 - k_1)^2 - m_\mu^2}\, \gamma^\nu (1 - \gamma_5) u(p_1)   (22.16)


where the $\epsilon^{\pm *}$ are the polarization vectors of the W’s:
$\epsilon^{-*}_\mu(k_2,\lambda_2)$ is that associated with the outgoing $W^-$
with 4-momentum $k_2$ and polarization state $\lambda_2$, and similarly for
$\epsilon^{+*}_\nu$.

FIGURE 22.4
$\mu^-$-exchange graph for $\nu_\mu + \bar\nu_\mu \to W^+ + W^-$.

To calculate the total cross section, we must form $|\mathcal{M}|^2$ and sum
over the three states of polarization for each of the W’s. To do this, we need
the result


\sum_{\lambda=0,\pm 1} \epsilon_\mu(k,\lambda)\, \epsilon^*_\nu(k,\lambda) = -g_{\mu\nu} + k_\mu k_\nu / M_W^2   (22.17)


already given in (19.19). Our interest will as usual be in the high-energy
behaviour of the cross section, in which regime it is clear that the
$k_\mu k_\nu/M_W^2$ term in (22.17) will dominate the $g_{\mu\nu}$ term. It is
therefore worth looking a little more closely at this term. From (19.17) and
(19.18) we see that in a frame in which $k^\mu = (k^0, 0, 0, |k|)$, the
transverse polarization vectors $\epsilon^\mu(k, \lambda = \pm 1)$ involve no
momentum dependence, which is in fact carried solely in the longitudinal
polarization vector $\epsilon^\mu(k, \lambda = 0)$. We may write this as


\epsilon^\mu(k, \lambda = 0) = \frac{k^\mu}{M_W} + \frac{M_W}{k^0 + |k|}\,(-1, 0, 0, 1)   (22.18)

which at high energy tends to $k^\mu/M_W$. Thus it is clear that it is the
longitudinal polarization states which are responsible for the $k^\mu k^\nu$
parts of the polarization sum (22.17), and which will dominate real production
of W’s at high energy.
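
The statement that the longitudinal polarization vector approaches k/M_W at
high energy can be checked numerically. The sketch below assumes the standard
form (|k|, 0, 0, k^0)/M for the helicity-zero vector of a massive spin-1
particle moving along the z-axis (the form referred to as (19.18), which is
not reproduced in this section), and splits it into k/M plus a remainder that
falls off with energy. The function names and the value of M_W are
illustrative assumptions.

```python
import math

def minkowski_dot(a, b):
    """Scalar product of two 4-vectors with metric signature (+, -, -, -)."""
    return a[0] * b[0] - sum(x * y for x, y in zip(a[1:], b[1:]))

def longitudinal_polarization(k_abs, mass):
    """Return (k, eps_direct, eps_split) for a vector boson moving along z."""
    k0 = math.sqrt(k_abs**2 + mass**2)
    k = (k0, 0.0, 0.0, k_abs)
    # Assumed standard helicity-0 polarization vector: (|k|, 0, 0, k0) / M
    eps_direct = (k_abs / mass, 0.0, 0.0, k0 / mass)
    # Same vector written as k^mu / M plus a remainder of size M / (k0 + |k|)
    r = mass / (k0 + k_abs)
    eps_split = tuple(ki / mass + r * n for ki, n in zip(k, (-1.0, 0.0, 0.0, 1.0)))
    return k, eps_direct, eps_split

M_W = 80.4  # GeV, illustrative value
for k_abs in (10.0, 100.0, 1000.0):
    k, eps, eps_split = longitudinal_polarization(k_abs, M_W)
    remainder = M_W / (k[0] + k_abs)
    print(f"|k| = {k_abs:7.1f} GeV  eps^0 = {eps[0]:8.3f}  remainder = {remainder:.4f}")
```

The components of the polarization vector grow linearly with |k| while the
remainder shrinks, which is the origin of the extra powers of energy in the
amplitudes for longitudinal W production.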

Concentrating therefore on the production of longitudinal W’s, we are led
to examine the quantity


\frac{g^4}{M_W^4 (p_1 - k_1)^4}\, \mathrm{Tr}\left[ \slashed{k}_2 (1 - \gamma_5)(\slashed{p}_1 - \slashed{k}_1) \slashed{k}_1 \slashed{p}_1 \slashed{k}_1 (\slashed{p}_1 - \slashed{k}_1) \slashed{k}_2 \slashed{p}_2 \right]   (22.19)


where we have neglected $m_\mu$, commuted the $(1 - \gamma_5)$ factors through,
and neglected neutrino masses, in forming
$\sum_{\rm spins} |\mathcal{M}_{00}|^2$. Retaining only the leading powers of
energy, we find (see problem 22.2)


\sum_{\rm spins} |\mathcal{M}_{00}|^2 \sim (g^4/M_W^4)(p_1 \cdot k_2)(p_2 \cdot k_2) = (g^4/M_W^4)\, E^4 (1 - \cos^2\theta)   (22.20)


where $E$ is the CM energy and $\theta$ the CM scattering angle. We see that
the (unsquared) amplitude must behave essentially as $g^2 E^2/M_W^2$, the
quantity $g^2/M_W^2$ effectively replacing $G_F$ of the current-current model.
The unitarity bound is violated for $E \geq M_W/g \sim 300$ GeV, taking
$g \approx e$.

Other unitarity-violating processes can easily be invented, and we have to
conclude that the IVB model is, in this respect, no more fitted to be called a
theory than was the four-fermion model. In the case of the latter, we argued
that the root of the disease lay in the fact that $G_F$ was not dimensionless,
yet making the coupling dimensionless, as in the IVB model, was somehow not a
good enough cure after all: perhaps (it is indeed so) ‘dimensionlessness’ is
necessary but not sufficient (see the following section). Why is this?
Returning to $\mathcal{M}_{\lambda_1\lambda_2}$ for
$\nu\bar\nu \to W^+W^-$ (equation (22.16)) and setting
$\epsilon_\mu = k_\mu/M_W$ for the longitudinal polarization vectors, we see
that we are involved with an effective amplitude


\frac{g^2}{M_W^2}\, \bar v(p_2)\, \slashed{k}_2 (1 - \gamma_5)\, \frac{(\slashed{p}_1 - \slashed{k}_1)}{(p_1 - k_1)^2}\, \slashed{k}_1 (1 - \gamma_5) u(p_1).   (22.21)

Using the Dirac equation $\slashed{p}_1 u(p_1) = 0$ (with $p_1^2 = 0$), this
reduces, up to a numerical factor, to

\frac{g^2}{M_W^2}\, \bar v(p_2)\, \slashed{k}_2 (1 - \gamma_5) u(p_1).   (22.22)


We see that the longitudinal $\epsilon$'s have brought in the factors
$M_W^{-1}$, which are ‘compensated’ by the factor $\slashed{k}_2$, and it is
this latter factor which causes the rise with energy. The longitudinal
polarization states have effectively reintroduced a dimensional coupling
constant $g/M_W$.
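
The reduction of the longitudinal amplitude to a single factor of
$\slashed{k}_2$ can be made explicit. The following worked steps are a sketch
under the same assumptions as the text (muon and neutrino masses neglected);
the shorthand $q = p_1 - k_1$ and the overall numerical factor $-2$ are
introduced here for illustration and depend on the normalization conventions
adopted for the amplitude.

```latex
% Shorthand: q = p_1 - k_1. Massless neutrino: \slashed{p}_1 u(p_1) = 0, p_1^2 = 0.
\begin{align*}
\slashed{q}\,\slashed{k}_1\,u(p_1)
  &= \slashed{q}\,(\slashed{p}_1 - \slashed{q})\,u(p_1)
   = -q^2\,u(p_1), \\
(1-\gamma_5)\,\slashed{q}\,\slashed{k}_1\,(1-\gamma_5)
  &= (1-\gamma_5)^2\,\slashed{q}\,\slashed{k}_1
   = 2\,(1-\gamma_5)\,\slashed{q}\,\slashed{k}_1, \\
\frac{g^2}{M_W^2}\,\bar v(p_2)\,\slashed{k}_2\,(1-\gamma_5)\,
  \frac{\slashed{q}}{q^2}\,\slashed{k}_1\,(1-\gamma_5)\,u(p_1)
  &= -\frac{2g^2}{M_W^2}\,\bar v(p_2)\,\slashed{k}_2\,(1-\gamma_5)\,u(p_1).
\end{align*}
```

The first line uses $\slashed{q}\slashed{q} = q^2$ and the Dirac equation; the
second uses the fact that $\gamma_5$ commutes with a product of two gamma
matrices. The propagator denominator cancels, leaving an amplitude that grows
with the W momentum $k_2$.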

What happens in QED? We learnt in section 7.3 that, for real photons, 
the longitudinal state of polarization is absent altogether. We might well 
suspect, therefore, that since it was the longitudinal W's that caused the ‘bad’ 
high-energy behaviour of the IVB model, the ‘good’ high-energy behaviour of 
QED might have its origin in the absence of such states for photons. And 
this circumstance can, in its turn, be traced (cf section 7.3.1 ) to the gauge 
invariance property of QED. 

Indeed, in section 8.6.3 we saw that in the analogue of (22.17) for photons
(this time involving only the two transverse polarization states), the
right-hand side could be taken to be just $-g_{\mu\nu}$, provided that the Ward
identity (8.166) held, a condition directly following from gauge invariance.

We have arrived here at an important theoretical indication that what we 
really need is a gauge theory of the weak interactions, in which the W's are 
gauge quanta. It must, however, be a peculiar kind of gauge theory, since 
normally gauge invariance requires the gauge field quanta to be massless. 
However, we have already seen how this ‘peculiarity’ can indeed arise, if the 
local symmetry is spontaneously broken (chapter 19). But before proceeding 
to implement that idea, in the GSW theory, we discuss one further disease 
(related to the unitarity one) possessed by both current-current and IVB 
models — that of non-renormalizability. 




FIGURE 22.5
O($g^4$) contribution to $\nu_\mu\bar\nu_\mu \to \nu_\mu\bar\nu_\mu$.


22.1.2 The problem of non-renormalizability in weak 
interactions 


The preceding line of argument about unitarity violations is open to the
following objection. It is an argument conducted entirely within the framework
of perturbation theory. What it shows, in fact, is simply that perturbation
theory must fail, in theories of the type considered, at some sufficiently high
energy. The essential reason is that the effective expansion parameter for
perturbation theory is $E G_F^{1/2}$. Since $E G_F^{1/2}$ becomes large at high
energy, arguments based on lowest-order perturbation theory are irrelevant. The
objection is perfectly valid, and we shall take account of it by linking
high-energy behaviour to the problem of renormalizability, rather than
unitarity. We might, however, just note in passing that yet another way of
stating the results of the previous two sections is to say that, for both the
current-current and IVB theories, ‘weak interactions become strong at energies
of order 1 TeV’.

We gave an elementary introduction to renormalization in chapters 10 and 
11 of volume 1. In particular, we discussed in some detail, in section 11.8, 
the difficulties that arise when one tries to do higher-order calculations in 
the case of a four-fermion interaction with the same form (apart from the 
V-A structure) as the current-current model. Its coupling constant, which we
called $G_F$, also had dimension (mass)$^{-2}$. The ‘non-renormalizable’ problem
was essentially that, as one approached the ‘dangerous’ energy scale (22.10), 
one needed to supply the values of an ever-increasing number of parameters 
from experiment, and the theory lost predictive power. 

Does the IVB model fare any better? In this case, the coupling constant 
is dimensionless, just as in QED. ‘Dimensionlessness’ alone is not enough, it 
turns out: the IVB model is not renormalizable either. We gave an indication 
of why this is so in section 19.1, but we shall now be somewhat more specific, 
relating the discussion to the previous one about unitarity. 

Consider, for example, the fourth-order process shown in figure 22.5, for
the IVB-mediated process $\nu_\mu\bar\nu_\mu \to \nu_\mu\bar\nu_\mu$. It seems
plausible from the diagram that the amplitude must be formed by somehow
‘sticking together’ two copies of the tree graph shown in figure 22.4.$^1$ Now
we saw that the high-energy behaviour of the amplitude
$\nu\bar\nu \to W^+W^-$ (figure 22.4) grows as $E^2$, due to the $k$ dependence
of the longitudinal polarization vectors, and this turns out to produce, via
figure 22.5, a non-renormalizable divergence, for the reason indicated in
section 19.1: namely, the ‘bad’ behaviour of the $k^\mu k^\nu/M_W^2$ factors
in the W-propagators, at large $k$.

FIGURE 22.6
O($e^4$) contributions to $e^+e^- \to e^+e^-$: graphs (a) and (b).

FIGURE 22.7
Lowest-order amplitudes for $e^+e^- \to \gamma\gamma$: (a) direct graph, (b) crossed graph.

So it is plain that, once again, the blame lies with the longitudinal
polarization states for the W’s. Let us see how QED, a renormalizable theory,
manages to avoid this problem. In this case, there are two box graphs, shown
in figures 22.6(a) and (b). There are also two corresponding tree graphs, shown
in figures 22.7(a) and (b). Consider, therefore, mimicking for figures 22.7(a)
and (b) the calculation we did for figure 22.4. We would obtain the leading
high-energy behaviour by replacing the photon polarization vectors by the
corresponding momenta, and it can be checked (problem 22.3) that when this
replacement is made for each photon the complete amplitude for the sum of
figures 22.7(a) and (b) vanishes.

$^1$ The reader may here usefully recall the discussion of unitarity for
one-loop graphs in section 13.3.3.

FIGURE 22.8
Four-point $e^+e^-$ vertex.

FIGURE 22.9
Four-point $\nu\bar\nu$ vertex.

In physical terms, of course, this result was expected, since we knew in
advance that it is always possible to choose polarization vectors for real
photons such that they are purely transverse, so that no physical process can
depend on a part of $\epsilon_\mu$ proportional to $k_\mu$. Nevertheless, the
calculation is highly relevant to the question of renormalizing the graphs in
figure 22.6. The photons in this process are not real external particles, but
are instead virtual, internal ones. This has the consequence that we should in
general include their longitudinal ($\epsilon_\mu \propto k_\mu$) states as
well as the transverse ones (see section 13.3.3 for something similar in the
case of unitarity for 1-loop diagrams). The calculation of problem 22.3 then
suggests that these longitudinal states are harmless, provided that both
contributions in figure 22.7 are included.

Indeed, the sum of these two box graphs for $e^+e^- \to e^+e^-$ is not
divergent. If it were, an infinite counter term proportional to a four-point
vertex $e^+e^- \to e^+e^-$ (figure 22.8) would have to be introduced, and the
original QED theory, which of course lacks such a fundamental interaction,
would not be renormalizable. This is exactly what does happen in the case of
figure 22.5. The bad high-energy behaviour of $\nu\bar\nu \to W^+W^-$
translates into a divergence of figure 22.5, and this time there is no
‘crossed’ amplitude to cancel it. This divergence entails the introduction of a
new vertex, figure 22.9, not present in the original IVB theory. Thus the
theory without this vertex is non-renormalizable; and if we include it, we are
landed with a four-field pointlike vertex which is non-renormalizable, as in
the Fermi (current-current) case.
Our presentation hitherto has emphasized the fact that, in QED, the bad
high-energy behaviour is rendered harmless by a cancellation between
contributions from figures 22.7(a) and (b) (or figures 22.6(a) and (b)). Thus
one way to ‘fix up’ the IVB theory might be to hypothesize a new physical
process, to be added to figure 22.4, in such a way that a cancellation occurred
at high energies. The search for such high-energy cancellation mechanisms can
indeed be pushed to a successful conclusion (Llewellyn Smith 1973), given
sufficient ingenuity and, arguably, a little hindsight. However, we are in
possession of a more powerful principle. In QED, we have already seen (section
8.6.2) that the vanishing of amplitudes when an $\epsilon_\mu$ is replaced by
the corresponding $k_\mu$ is due to gauge invariance: in other words, the
potentially harmful longitudinal polarization states are in fact harmless in a
gauge-invariant theory.

We have therefore arrived once more, after a somewhat more leisurely 
discussion than that of section 19.1, at the idea that we need a gauge theory 
of massive vector bosons, so that the offending $k^\mu k^\nu$ part of the propagator can
be ‘gauged away’ as in the photon case. This is precisely what is provided by
the ‘spontaneously broken’ gauge theory concept, as developed in chapter 19.
There we saw that, taking the U(1) case for simplicity, the general expression
for the gauge boson propagator in such a theory (in a ’t Hooft gauge) is

$$ \mathrm{i}\left[-g^{\mu\nu} + \frac{(1-\xi)k^\mu k^\nu}{k^2 - \xi M_W^2}\right] \Big/ \left(k^2 - M_W^2 + \mathrm{i}\epsilon\right) \qquad (22.23) $$

where $\xi$ is a gauge parameter. Our IVB propagator corresponds to the
$\xi \to \infty$ limit, and with this choice of $\xi$ all the troubles we have been
discussing appear to be present. But for any finite $\xi$ (for example $\xi = 1$) the
high-energy behaviour of the propagator is actually $\sim 1/k^2$, the same as in the
renormalizable QED case. This strongly suggests that such theories, in
particular non-Abelian ones, are in fact renormalizable. ’t Hooft’s proof
that they are (’t Hooft 1971b) triggered an explosion of theoretical work, as it
became clear that, for the first time, it would be possible to make higher-order 
calculations for weak interaction processes using consistent renormalization 
procedures, of the kind that had worked so well for QED. 
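
The contrast between finite $\xi$ and the $\xi \to \infty$ limit can be checked numerically. The following is a minimal sketch, not part of the original argument: it evaluates the coefficient of $k^\mu k^\nu$ in (22.23), multiplied by $k^2$ to mimic the growth of the numerator. The function name `kk_coefficient` and the value of 80 GeV for the boson mass are illustrative assumptions.

```python
def kk_coefficient(k2, xi, mass2):
    """Coefficient of k^mu k^nu in (22.23), with the i*epsilon dropped."""
    return (1.0 - xi) / ((k2 - xi * mass2) * (k2 - mass2))


MASS2 = 80.0 ** 2  # assumed illustrative boson mass squared, in GeV^2

if __name__ == "__main__":
    for k2 in (1e6, 1e8, 1e10):
        # Finite xi: k^2 times the coefficient falls off like 1/k^2.
        finite_xi = k2 * kk_coefficient(k2, 3.0, MASS2)
        # Very large xi (unitary-gauge limit): it tends to the constant 1/M^2.
        large_xi = k2 * kk_coefficient(k2, 1e30, MASS2)
        print(f"k^2 = {k2:.0e}: xi = 3 gives {finite_xi:.3e}, "
              f"xi -> infinity gives {large_xi:.3e}")
```

For $\xi = 1$ the $k^\mu k^\nu$ term vanishes identically, which is why that choice is often the most convenient one in practice.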

We now have all the pieces in place, and can proceed to introduce the 
GSW theory, based on the local gauge symmetry of SU(2) x U(1). 




22.2 The SU(2) x U(1) electroweak gauge theory 


22.2.1 Quantum number assignments; Higgs, W and Z masses


Given the preceding motivations for considering a gauge theory of weak in- 
teractions, the remaining question is this: what is the relevant symmetry 
group of local phase transformations, i.e. the relevant weak gauge group?
Several possibilities were suggested, but it is now very well established that the
one originally proposed by Glashow (1961), subsequently treated as a spon- 
taneously broken gauge symmetry by Weinberg (1967) and by Salam (1968), 
and later extended by other authors, produces a theory which is in remarkable 
agreement with currently known data. We shall not give a critical review of 
all the experimental evidence, but instead proceed directly to an outline of 
the GSW theory, introducing elements of the data at illustrative points. 

An important clue to the symmetry group involved in the weak interac- 
tions is provided by considering the transitions induced by these interactions. 
This is somewhat analogous to discovering the multiplet structure of atomic 
levels and hence the representations of the rotation group, a prominent
symmetry of the Schrödinger equation, by studying electromagnetic transitions.
However, there is one very important difference between the ‘weak multiplets’ 
we shall be considering, and those associated with symmetries which are not 
spontaneously broken. We saw in chapter 12 how an unbroken non-Abelian 
symmetry leads to multiplets of states which are degenerate in mass, but in 
section 17.1 we learned that that result only holds provided the vacuum is 
left invariant under the symmetry transformation. When the symmetry is 
spontaneously broken, the vacuum is not invariant, and we must expect that 
the degenerate multiplet structure will then, in general, disappear completely. 
This is precisely the situation in the electroweak theory. 

Nevertheless, as we shall see, essential consequences of the weak symme- 
try group — specifically, the relations it requires between otherwise unrelated 
masses and couplings — are accessible to experiment. Moreover, despite the 
fact that members of a multiplet of a global symmetry which is spontaneously 
broken will, in general, no longer have even approximately the same mass, 
the concept of a multiplet is still useful. This is because when the symmetry 
is made a local one, we shall find (in sections 22.2.2 and 22.2.3) that the as- 
sociated gauge quanta still mediate interactions between members of a given 
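symmetry multiplet, just as in the manifest local non-Abelian symmetry
example of QCD. Now, the leptonic transitions associated with the weak charged
currents are, as we saw in chapter 20, $\nu_e \leftrightarrow e$, $\nu_\mu \leftrightarrow \mu$ etc.
This suggests that these pairs should be regarded as doublets under some group.
Further, we saw in section 20.7 how weak transitions involving charged quarks
suggested a similar doublet structure for them also. The simplest possibility is
therefore to suppose that, in both cases, a ‘weak SU(2) group’ is involved, called
‘weak isospin’. We emphasize once more that this weak isospin is distinct
from the hadronic isospin of chapter 12, which is part of SU(3)$_f$. We use the
symbols $t$, $t_3$ for the quantum numbers of weak isospin, and make the specific
assignments for the leptonic fields

$$ t = 1/2: \quad \begin{array}{l} t_3 = +1/2 \\ t_3 = -1/2 \end{array} \quad \begin{pmatrix} \hat\nu'_e \\ \hat e^- \end{pmatrix}_L, \quad \begin{pmatrix} \hat\nu'_\mu \\ \hat\mu^- \end{pmatrix}_L, \quad \begin{pmatrix} \hat\nu'_\tau \\ \hat\tau^- \end{pmatrix}_L \qquad (22.24) $$

where $\hat e_L = \frac{1}{2}(1 - \gamma_5)\hat e$ etc, and for the quark fields

$$ t = 1/2: \quad \begin{array}{l} t_3 = +1/2 \\ t_3 = -1/2 \end{array} \quad \begin{pmatrix} \hat u \\ \hat d' \end{pmatrix}_L, \quad \begin{pmatrix} \hat c \\ \hat s' \end{pmatrix}_L, \quad \begin{pmatrix} \hat t \\ \hat b' \end{pmatrix}_L. \qquad (22.25) $$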


As discussed in section 20.2.2, the subscript ‘L’ refers to the fact that only the 
left-handed chiral components of the fields enter, in consequence of the V-A 
structure. For this reason, the weak isospin group is referred to as SU(2)$_L$,
to show that the weak isospin assignments and corresponding transformation 
properties apply only to these left-handed parts. Notice that, as anticipated 
for a spontaneously broken symmetry, these doublets all involve pairs of parti- 
cles which are not mass degenerate. In (22.24) and (22.25), the primes indicate 
that these fields are related to the (unprimed) fields of definite mass by the 
unitary matrices U (for neutrinos) and V (for quarks), as discussed in sections 
21.4.1 and 20.7.3 respectively. 

Making this SU(2)$_L$ into a local phase invariance (following the logic of
chapter 13) will entail the introduction of three gauge fields, transforming as
a $t = 1$ multiplet (a triplet) under the group. Because (as with the ordinary
SU(2)$_f$ of hadronic isospin) the members of a weak isodoublet differ by
one unit of charge, the two gauge fields associated with transitions between
doublet members will have charge $\pm 1$. The quanta of these fields will, of
course, be the now familiar $W^\pm$ bosons mediating the charged current
transitions, and associated with the weak isospin raising and lowering operators
$t_\pm$. What about the third gauge boson of the triplet? This will be electrically
neutral, and a very economical and appealing idea would be to associate this
neutral vector particle with the photon, thereby unifying the weak and
electromagnetic interactions. A model of this kind was originally suggested by
Schwinger (1957). Of course, the W's must somehow acquire mass, while the
photon remains massless. Schwinger arranged this by introducing appropriate
couplings of the vector bosons to additional scalar and pseudoscalar fields.
These couplings were arbitrary and no prediction of the W masses could be
made. We now believe, following the arguments of the preceding section,
that the W mass must arise via the spontaneous breakdown of a non-Abelian
gauge symmetry, and as we saw in section 19.6, this does constrain the W
mass.

Apart from the question of the W mass in Schwinger's model, we now 
know (see chapter 20) that there exist neutral current weak interactions, in 
addition to those of the charged currents. We must also include these in our 
emerging gauge theory, and an obvious suggestion is to have these currents 
mediated by the neutral member $W^0$ of the SU(2) gauge field triplet. Such a
scheme was indeed proposed by Bludman (1958), again pre-Higgs, so that W
masses were put in ‘by hand’. In this model, however, the neutral currents will
have the same pure left-handed V-A structure as the charged currents: but,
as we saw in chapter 20, the neutral currents are not pure V-A. Furthermore, 
the attractive feature of including the photon, and thus unifying weak and 
electromagnetic interactions, has been lost. 

A key contribution was made by Glashow (1961); similar ideas were also
advanced by Salam and Ward (1964). Glashow suggested enlarging the
Schwinger-Bludman SU(2) schemes by inclusion of an additional U(1) gauge
group, resulting in an ‘SU(2)$_L$ × U(1)’ group structure. The new Abelian U(1)
group is associated with a weak analogue of hypercharge, ‘weak hypercharge’,
just as SU(2)$_L$ was associated with ‘weak isospin’. Indeed, Glashow
proposed that the Gell-Mann-Nishijima relation for charges should also hold for
these weak analogues, giving

$$ eQ = e(t_3 + y/2) \qquad (22.26) $$

for the electric charge $Q$ (in units of $e$) of the $t_3$ member of a weak
isomultiplet, assigned a weak hypercharge $y$. Clearly, therefore, the lepton doublets,
$(\nu'_e, e^-)$ etc, then have $y = -1$, while the quark doublets $(u, d')$ etc, have
$y = +\frac{1}{3}$. Now, when this group is gauged, everything falls marvellously into
place: the charged vector bosons appear as before, but there are now two
neutral vector bosons, which between them will be responsible for the weak
neutral current processes, and for electromagnetism. This is exactly the piece
of mathematics we went through in section 19.6, which we now appropriate
as an important part of the Standard Model.

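For convenience, we reproduce here the main results of section 19.6. The
Higgs field $\hat\phi$ is an SU(2) doublet

$$ \hat\phi = \begin{pmatrix} \hat\phi^+ \\ \hat\phi^0 \end{pmatrix} \qquad (22.27) $$

with an assumed vacuum expectation value (in unitary gauge) given by

$$ \langle 0|\hat\phi|0\rangle = \begin{pmatrix} 0 \\ v/\sqrt{2} \end{pmatrix}. \qquad (22.28) $$

Fluctuations about this value are parametrized in this gauge by

$$ \hat\phi = \begin{pmatrix} 0 \\ \frac{1}{\sqrt{2}}(v + \hat H) \end{pmatrix} \qquad (22.29) $$

where $\hat H$ is the (physical) Higgs field. The Lagrangian for the sector consisting
of the gauge fields and the Higgs fields is

$$ \hat{\mathcal{L}}_{G\Phi} = (\hat D_\mu\hat\phi)^\dagger(\hat D^\mu\hat\phi) + \mu^2\hat\phi^\dagger\hat\phi - \frac{\lambda}{4}(\hat\phi^\dagger\hat\phi)^2 - \frac{1}{4}\hat{\boldsymbol{F}}_{\mu\nu}\cdot\hat{\boldsymbol{F}}^{\mu\nu} - \frac{1}{4}\hat G_{\mu\nu}\hat G^{\mu\nu} \qquad (22.30) $$

where $\hat{\boldsymbol{F}}_{\mu\nu}$ is the SU(2) field strength tensor (19.80) for the gauge fields
$\hat{\boldsymbol{W}}^\mu$, $\hat G_{\mu\nu}$ is the U(1) field strength tensor (19.81) for the gauge field
$\hat B^\mu$, and $\hat D^\mu$ is given by (19.79). After symmetry breaking (i.e. the insertion
of (22.29) in (22.30)) the quadratic parts of (22.30) can be written in unitary
gauge as (see problem 19.9)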


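$$
\begin{aligned}
\hat{\mathcal{L}}^{\rm free}_{G\Phi} ={}& \tfrac{1}{2}\partial_\mu\hat H\,\partial^\mu\hat H - \mu^2\hat H^2 && (22.31)\\
& -\tfrac{1}{4}(\partial_\mu\hat W_{1\nu} - \partial_\nu\hat W_{1\mu})(\partial^\mu\hat W_1^\nu - \partial^\nu\hat W_1^\mu) + \tfrac{1}{8}g^2v^2\,\hat W_{1\mu}\hat W_1^\mu && (22.32)\\
& -\tfrac{1}{4}(\partial_\mu\hat W_{2\nu} - \partial_\nu\hat W_{2\mu})(\partial^\mu\hat W_2^\nu - \partial^\nu\hat W_2^\mu) + \tfrac{1}{8}g^2v^2\,\hat W_{2\mu}\hat W_2^\mu && (22.33)\\
& -\tfrac{1}{4}(\partial_\mu\hat Z_\nu - \partial_\nu\hat Z_\mu)(\partial^\mu\hat Z^\nu - \partial^\nu\hat Z^\mu) + \tfrac{1}{8}v^2(g^2 + g'^2)\,\hat Z_\mu\hat Z^\mu && (22.34)\\
& -\tfrac{1}{4}\hat F_{\mu\nu}\hat F^{\mu\nu} && (22.35)
\end{aligned}
$$

where

$$ \hat Z^\mu = \cos\theta_W\,\hat W_3^\mu - \sin\theta_W\,\hat B^\mu, \qquad (22.36) $$

$$ \hat A^\mu = \sin\theta_W\,\hat W_3^\mu + \cos\theta_W\,\hat B^\mu, \qquad (22.37) $$

and

$$ \hat F^{\mu\nu} = \partial^\mu\hat A^\nu - \partial^\nu\hat A^\mu, \qquad (22.38) $$

with

$$ \cos\theta_W = g/(g^2 + g'^2)^{1/2}, \qquad \sin\theta_W = g'/(g^2 + g'^2)^{1/2}. \qquad (22.39) $$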


Feynman rules for the vector boson propagators (in unitary gauge) and cou- 
plings, and for the Higgs couplings, can be read off from (22.30), and are given 
in appendix Q. 

Equations (22.31)-(22.35) give the tree-level masses of the Higgs boson
and the gauge bosons: (22.31) tells us that the mass of the Higgs boson is

$$ m_H = \sqrt{2}\,\mu = \sqrt{\lambda}\,v/\sqrt{2}, \qquad (22.40) $$

where $v/\sqrt{2}$ is the (tree-level) Higgs vacuum value; (22.32) and (22.33) show
that the charged W’s have a mass

$$ M_W = gv/2 \qquad (22.41) $$

where $g$ is the SU(2) gauge coupling constant; (22.34) gives the mass of the
$Z^0$ as

$$ M_Z = M_W/\cos\theta_W \qquad (22.42) $$

and (22.35) shows that the $\hat A^\mu$ field describes a massless particle (to be
identified with the photon).

Still unaccounted for are the right-handed chiral components of the fermion
fields. There is at present no evidence for any weak interactions coupling to
the right-handed field components, and it is therefore natural (and a basic
assumption of the electroweak theory) that all ‘R’ components are singlets
under the weak isospin group. Crucially, however, the ‘R’ components do
interact via the U(1) field $\hat B^\mu$; it is this that allows electromagnetism to emerge
free of parity-violating $\gamma_5$ terms, as we shall see. With the help of the weak
charge formula (equation (22.26)), we arrive at the assignments shown in table
22.1.

TABLE 22.1
Weak isospin and hypercharge assignments.

Fields                                       $t$    $t_3$    $y$     $Q$
$\nu_{eL}$, $\nu_{\mu L}$, $\nu_{\tau L}$    1/2    1/2     -1      0
$\nu_{eR}$, $\nu_{\mu R}$, $\nu_{\tau R}$    0      0        0      0
$e_L$, $\mu_L$, $\tau_L$                     1/2   -1/2     -1     -1
$e_R$, $\mu_R$, $\tau_R$                     0      0       -2     -1
$u_L$, $c_L$, $t_L$                          1/2    1/2      1/3    2/3
$u_R$, $c_R$, $t_R$                          0      0        4/3    2/3
$d'_L$, $s'_L$, $b'_L$                       1/2   -1/2      1/3   -1/3
$d_R$, $s_R$, $b_R$                          0      0       -2/3   -1/3
$\phi^+$                                     1/2    1/2      1      1
$\phi^0$                                     1/2   -1/2      1      0
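
The entries of table 22.1 can be checked mechanically against (22.26). The sketch below is illustrative only: the row labels are shorthand chosen here for one representative of each family, and exact rational arithmetic is used so that no rounding enters.

```python
from fractions import Fraction as F

# (label, t3, y, Q) for one representative of each row of table 22.1.
# The labels are shorthand introduced for this sketch.
ASSIGNMENTS = [
    ("nu_L", F(1, 2), F(-1), F(0)),
    ("nu_R", F(0), F(0), F(0)),
    ("e_L", F(-1, 2), F(-1), F(-1)),
    ("e_R", F(0), F(-2), F(-1)),
    ("u_L", F(1, 2), F(1, 3), F(2, 3)),
    ("u_R", F(0), F(4, 3), F(2, 3)),
    ("d_L", F(-1, 2), F(1, 3), F(-1, 3)),
    ("d_R", F(0), F(-2, 3), F(-1, 3)),
    ("phi_plus", F(1, 2), F(1), F(1)),
    ("phi_zero", F(-1, 2), F(1), F(0)),
]


def charge(t3, y):
    """Electric charge in units of e from (22.26): Q = t3 + y/2."""
    return t3 + y / 2


if __name__ == "__main__":
    for label, t3, y, q in ASSIGNMENTS:
        status = "ok" if charge(t3, y) == q else "MISMATCH"
        print(f"{label:9s} t3 = {t3!s:5s} y = {y!s:5s} Q = {q!s:5s} {status}")
```

Note that the two members of each left-handed doublet share the same $y$, as they must if the U(1) factor is to commute with SU(2)$_L$.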

We have included ‘R’ components for the neutrinos in the table. It is, 
however, fair to say that in the original Standard Model the neutrinos were 
taken to be massless, with no neutrino mixing. We have seen in chapter 20 
that it is for many purposes an excellent approximation to treat the neutrinos 
as massless, except when discussing neutrino oscillations. We shall mention 
their masses again in section 22.5.2, but for the moment we proceed in the 
‘massless neutrinos’ approximation. In this case, there are no ‘R’ components 
for neutrinos, and no neutrino mixing. 

We can now proceed to write down the currents of the electroweak theory. 
We will show that these dynamical symmetry currents are precisely the same 
as the phenomenological currents of the current-current model developed in 
chapter 20. The new feature here is that — as in the electromagnetic case — 
the currents interact with each other by the exchange of a gauge boson, rather 
than directly. 


22.2.2 The leptonic currents (massless neutrinos): relation to current-current model


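We write the SU(2)$_L$ × U(1) covariant derivative, in terms of the fields
$\hat{\boldsymbol{W}}^\mu$ and $\hat B^\mu$ of section 19.6, as

$$ \hat D^\mu = \partial^\mu + \mathrm{i}g\,\boldsymbol{\tau}\cdot\hat{\boldsymbol{W}}^\mu/2 + \mathrm{i}g'y\hat B^\mu/2 \quad \text{on ‘L’ SU(2) doublets} \qquad (22.43) $$

and as

$$ \hat D^\mu = \partial^\mu + \mathrm{i}g'y\hat B^\mu/2 \quad \text{on ‘R’ SU(2) singlets.} \qquad (22.44) $$

The leptonic couplings to the gauge fields therefore arise from the
‘gauge-covariantized’ free leptonic Lagrangian:

$$ \hat{\mathcal{L}}_{\rm lept} = \sum_{f=e,\mu,\tau} \bar{\hat l}_{fL}\,\mathrm{i}\gamma_\mu\hat D^\mu\,\hat l_{fL} + \sum_{f=e,\mu,\tau} \bar{\hat l}_{fR}\,\mathrm{i}\gamma_\mu\hat D^\mu\,\hat l_{fR}, \qquad (22.45) $$

where the $\hat l_{fL}$ are the left-handed doublets

$$ \hat l_{fL} = \begin{pmatrix} \hat\nu_f \\ \hat f^- \end{pmatrix}_L \qquad (22.46) $$

and $\hat l_{fR}$ are the singlets $\hat l_{eR} = \hat e_R$ etc.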

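Consider first the charged leptonic currents. The correct normalization for
the charged fields is that $\hat W^\mu \equiv (\hat W_1^\mu - \mathrm{i}\hat W_2^\mu)/\sqrt{2}$ destroys the $W^+$ or
creates the $W^-$ (cf (7.15)). The ‘$\boldsymbol{\tau}\cdot\hat{\boldsymbol{W}}/2$’ terms can be written as

$$ \boldsymbol{\tau}\cdot\hat{\boldsymbol{W}}^\mu/2 = \frac{1}{\sqrt{2}}\left\{ \tau_+\frac{(\hat W_1^\mu - \mathrm{i}\hat W_2^\mu)}{\sqrt{2}} + \tau_-\frac{(\hat W_1^\mu + \mathrm{i}\hat W_2^\mu)}{\sqrt{2}} \right\} + \frac{\tau_3}{2}\hat W_3^\mu, \qquad (22.47) $$

where $\tau_\pm = (\tau_1 \pm \mathrm{i}\tau_2)/2$ are the usual raising and lowering operators for the
doublets. Thus the ‘f=e’ contribution to the first term in (22.45) picks out
the process $e^- \to \nu_e + W^-$ for example, with the result that the corresponding
vertex is given by

$$ -\frac{\mathrm{i}g}{\sqrt{2}}\,\gamma^\mu\,\frac{(1 - \gamma_5)}{2}. \qquad (22.48) $$

The ‘universality’ of the single coupling constant ‘g’ ensures that (22.48) is
also the amplitude for the $\mu$-$\nu_\mu$-W and $\tau$-$\nu_\tau$-W vertices. Thus the
amplitude for the $\nu_\mu + e^- \to \mu^- + \nu_e$ process considered in section 20.8 is

$$ \left\{ -\frac{\mathrm{i}g}{\sqrt{2}}\,\bar u(\mu)\gamma_\mu\frac{(1 - \gamma_5)}{2}u(\nu_\mu) \right\} \frac{\mathrm{i}\,[-g^{\mu\nu} + k^\mu k^\nu/M_W^2]}{k^2 - M_W^2} \left\{ -\frac{\mathrm{i}g}{\sqrt{2}}\,\bar u(\nu_e)\gamma_\nu\frac{(1 - \gamma_5)}{2}u(e) \right\}, \qquad (22.49) $$

corresponding to the Feynman graph of figure 22.10.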
For $k^2 \ll M_W^2$ we can replace the W-propagator factor
$[-g^{\mu\nu} + k^\mu k^\nu/M_W^2]/(k^2 - M_W^2)$ by the constant value
$g^{\mu\nu}/M_W^2$, leading to the amplitude

$$ -\frac{\mathrm{i}g^2}{8M_W^2}\,\bar u(\mu)\gamma_\mu(1 - \gamma_5)u(\nu_\mu)\,\bar u(\nu_e)\gamma^\mu(1 - \gamma_5)u(e), \qquad (22.50) $$

which may be compared with the form we used in the current-current theory,
equation (20.50). This comparison gives

$$ \frac{G_F}{\sqrt{2}} = \frac{g^2}{8M_W^2}. \qquad (22.51) $$

This is an important equation, giving the precise version, in the GSW theory,
of the qualitative relation $g^2/M_W^2 \sim G_F$ introduced following equation (22.20),
and in volume 1, at equation (1.32).

FIGURE 22.10
W-exchange process in $\nu_\mu + e^- \to \mu^- + \nu_e$.

Putting together (22.41) and (22.51) we can deduce

$$ G_F/\sqrt{2} = 1/(2v^2) \qquad (22.52) $$

so that from the known value (22.4) of $G_F$ there follows the value of $v$:

$$ v \simeq 246\ {\rm GeV}. \qquad (22.53) $$

Alternatively we may quote $v/\sqrt{2}$ (the vacuum value of the Higgs field):

$$ v/\sqrt{2} \simeq 174\ {\rm GeV}. \qquad (22.54) $$

This parameter sets the scale of electroweak symmetry breaking, but as yet
no theory is able to predict its value. It is related to the parameters $\lambda$, $\mu$ of
(22.30) by $v/\sqrt{2} = \sqrt{2}\,\mu/\lambda^{1/2}$ (cf (17.98)).
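
As a quick arithmetic check of (22.52)-(22.54), the following sketch inverts $G_F/\sqrt{2} = 1/(2v^2)$ for $v$. The numerical value of $G_F$ used here is an assumed input (a commonly quoted value), since (22.4) is not reproduced in this section; the function name is likewise illustrative.

```python
import math

G_F = 1.16638e-5  # assumed Fermi constant in GeV^-2


def higgs_vev(g_f=G_F):
    """Solve G_F / sqrt(2) = 1 / (2 v^2) for v, in GeV."""
    return (math.sqrt(2) * g_f) ** -0.5


if __name__ == "__main__":
    v = higgs_vev()
    print(f"v         = {v:.1f} GeV")
    print(f"v/sqrt(2) = {v / math.sqrt(2):.1f} GeV")
```

With this input the two numbers come out close to the quoted 246 GeV and 174 GeV.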

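In general, the charge-changing part of (22.45) can be written as

$$ -\frac{g}{\sqrt{2}}\left\{ \bar{\hat\nu}_e\gamma^\mu\frac{(1 - \gamma_5)}{2}\hat e + \bar{\hat\nu}_\mu\gamma^\mu\frac{(1 - \gamma_5)}{2}\hat\mu + \bar{\hat\nu}_\tau\gamma^\mu\frac{(1 - \gamma_5)}{2}\hat\tau \right\}\hat W_\mu + \text{hermitian conjugate}, \qquad (22.55) $$

where $\hat W^\mu = (\hat W_1^\mu - \mathrm{i}\hat W_2^\mu)/\sqrt{2}$. (22.55) has the form

$$ -\hat j^\mu_{\rm CC}({\rm leptons})\,\hat W_\mu - \hat j^{\mu\dagger}_{\rm CC}({\rm leptons})\,\hat W^\dagger_\mu \qquad (22.56) $$

where the leptonic weak charged current $\hat j^\mu_{\rm CC}$(leptons) is precisely that used in
the current-current model (equation (20.38)), up to the usual factors of $g$'s and
$\sqrt{2}$'s. Thus the dynamical symmetry currents of the SU(2)$_L$ gauge theory are
exactly the ‘phenomenological’ currents of the earlier current-current model.
The Feynman rules for the lepton-W couplings (appendix Q) can be read off
from (22.55).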
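
Turning now to the leptonic weak neutral current, this will appear via the
couplings to the $Z^0$, written as

$$ -\hat j^\mu_{\rm NC}({\rm leptons})\,\hat Z_\mu. \qquad (22.57) $$

Referring to (22.36) for the linear combination of $\hat W_3^\mu$ and $\hat B^\mu$ which represents
$\hat Z^\mu$, we find (problem 22.4)

$$ \hat j^\mu_{\rm NC}({\rm leptons}) = \frac{g}{\cos\theta_W}\sum_l \bar{\hat\psi}_l\gamma^\mu\left[ t_3^l\,\frac{(1 - \gamma_5)}{2} - \sin^2\theta_W\,Q_l \right]\hat\psi_l, \qquad (22.58) $$

where the sum is over the six lepton fields $\nu_e, e^-, \nu_\mu, \ldots, \tau^-$. For the $Q = 0$
neutrinos with $t_3 = +\frac{1}{2}$,

$$ \hat j^\mu_{\rm NC}({\rm neutrinos}) = \frac{g}{2\cos\theta_W}\sum_l \bar{\hat\nu}_l\gamma^\mu\frac{(1 - \gamma_5)}{2}\hat\nu_l, \qquad (22.59) $$

where now $l = e, \mu, \tau$. For the other (negatively charged) leptons, we shall
have both L and R couplings from (22.58), and we can write

$$ \hat j^\mu_{\rm NC}({\rm charged\ leptons}) = \frac{g}{\cos\theta_W}\sum_{l=e,\mu,\tau} \bar{\hat l}\gamma^\mu\left[ c_L^l\,\frac{(1 - \gamma_5)}{2} + c_R^l\,\frac{(1 + \gamma_5)}{2} \right]\hat l, \qquad (22.60) $$

where

$$ c_L^l = t_3^l - \sin^2\theta_W\,Q_l = -\frac{1}{2} + \sin^2\theta_W \qquad (22.61) $$

$$ c_R^l = -\sin^2\theta_W\,Q_l = \sin^2\theta_W. \qquad (22.62) $$

As noted earlier, the $Z^0$ coupling is not pure ‘V-A’. These relations (22.59)-
(22.62) are exactly the ones given earlier, in (20.85)-(20.87); in particular,
the couplings are independent of ‘l’ and hence exhibit lepton universality.
The alternative notation

$$ \hat j^\mu_{\rm NC}({\rm charged\ leptons}) = \frac{g}{2\cos\theta_W}\sum_l \bar{\hat l}\gamma^\mu(g_V^l - g_A^l\gamma_5)\hat l \qquad (22.63) $$

is often used, where

$$ g_V^l = -\frac{1}{2} + 2\sin^2\theta_W, \qquad g_A^l = -\frac{1}{2}, \qquad \text{independent of } l. \qquad (22.64) $$

Note that the $g_V^l$ vanishes for $\sin^2\theta_W = 0.25$. Again, the Feynman rules for
lepton-Z couplings (appendix Q) are contained in (22.59) and (22.60).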

As in the case of W-mediated charge-changing processes, $Z^0$-mediated
processes reduce to the current-current form at low $k^2$. For example, the
amplitude for $e^-\mu^- \to e^-\mu^-$ via $Z^0$ exchange (figure 22.11) reduces to

$$ -\frac{\mathrm{i}g^2}{4\cos^2\theta_W M_Z^2}\,\bar u(e)\gamma_\mu[c_L^l(1 - \gamma_5) + c_R^l(1 + \gamma_5)]u(e)\;\bar u(\mu)\gamma^\mu[c_L^l(1 - \gamma_5) + c_R^l(1 + \gamma_5)]u(\mu). \qquad (22.65) $$

FIGURE 22.11
$Z^0$-exchange process in $e^-\mu^- \to e^-\mu^-$.

It is customary to define the parameter

$$ \rho = M_W^2/(M_Z^2\cos^2\theta_W), \qquad (22.66) $$

which is unity at tree-level, in the absence of loop corrections. The ratio of
factors in front of the $\bar u \ldots u$ expressions in (22.65) and (22.50) (i.e. ‘neutral
current process’/‘charged current process’) is then $2\rho$.

We may also check the electromagnetic current in the theory, by looking
for the piece that couples to $\hat A^\mu$. We find

$$ \hat j^\mu_{\rm emag} = -g\sin\theta_W\sum_{l=e,\mu,\tau} \bar{\hat l}\gamma^\mu\hat l \qquad (22.67) $$

which allows us to identify the electromagnetic charge $e$ as

$$ e = g\sin\theta_W \qquad (22.68) $$

as already suggested in (19.97) of chapter 19. Note that all the $\gamma_5$’s cancel
from (22.67), as is of course required.


22.2.3 The quark currents


The charge-changing quark currents, which are coupled to the $W^\pm$ fields, have
a form very similar to that of the charged leptonic currents, except that the
$t_3 = -\frac{1}{2}$ components of the L-doublets have to be understood as the
flavour-mixed (weakly interacting) states

$$ \begin{pmatrix} \hat d' \\ \hat s' \\ \hat b' \end{pmatrix}_L = \begin{pmatrix} V_{ud} & V_{us} & V_{ub} \\ V_{cd} & V_{cs} & V_{cb} \\ V_{td} & V_{ts} & V_{tb} \end{pmatrix} \begin{pmatrix} \hat d \\ \hat s \\ \hat b \end{pmatrix}_L \qquad (22.69) $$

where $\hat d$, $\hat s$ and $\hat b$ are the strongly interacting fields with masses $m_d$, $m_s$ and
$m_b$, and the V-matrix is the CKM matrix used extensively in chapter 21. We
shall discuss this matrix further in section 22.5.2. Thus the charge-changing
weak quark current is

$$ \hat j^\mu_{\rm CC}({\rm quarks}) = \frac{g}{\sqrt{2}}\left\{ \bar{\hat u}\gamma^\mu\frac{(1 - \gamma_5)}{2}\hat d' + \bar{\hat c}\gamma^\mu\frac{(1 - \gamma_5)}{2}\hat s' + \bar{\hat t}\gamma^\mu\frac{(1 - \gamma_5)}{2}\hat b' \right\}, \qquad (22.70) $$

which generalizes (20.90) to three generations and supplies the factor $g/\sqrt{2}$,
as for the leptons.

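The neutral currents are diagonal in flavour if the matrix V is unitary (see
also section 22.5.2). Thus $\hat j^\mu_{\rm NC}$(quarks) will be given by the same expression
as (20.103), except that now the sum will be over all six quark flavours. The
neutral weak quark current is thus

$$ \hat j^\mu_{\rm NC}({\rm quarks}) = \frac{g}{\cos\theta_W}\sum_q \bar{\hat q}\gamma^\mu\left[ c_L^q\,\frac{(1 - \gamma_5)}{2} + c_R^q\,\frac{(1 + \gamma_5)}{2} \right]\hat q, \qquad (22.71) $$

where

$$ c_L^q = t_3^q - \sin^2\theta_W\,Q_q \qquad (22.72) $$

$$ c_R^q = -\sin^2\theta_W\,Q_q. \qquad (22.73) $$

These expressions are exactly as given in (20.103)-(20.105). As for the charged
leptons, we can alternatively write (22.71) as

$$ \hat j^\mu_{\rm NC}({\rm quarks}) = \frac{g}{2\cos\theta_W}\sum_q \bar{\hat q}\gamma^\mu(g_V^q - g_A^q\gamma_5)\hat q, \qquad (22.74) $$

where

$$ g_V^q = t_3^q - 2\sin^2\theta_W\,Q_q \qquad (22.75) $$

$$ g_A^q = t_3^q. \qquad (22.76) $$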

Before proceeding to discuss some simple phenomenological consequences, 
we remind the reader of one important feature of the Standard Model currents 
in general. Reading (22.24) and (22.25) together ‘vertically’, the leptons and 
quarks are grouped in three generations, each with two leptons and two quarks. 
The theoretical motivation for such family grouping is that anomalies are 
cancelled within each complete generation, as discussed in section 18.4. 




22.3 Simple (tree-level) predictions 


The theory as so far developed has just 4 parameters: the gauge couplings $g$
and $g'$, and the parameters $\lambda$ and $\mu$ of the Higgs potential. The previous two
subsections show that all the couplings to fermions can be written in terms
of the known quantities $G_F$ and $e$ (or $\alpha$), and one free parameter which may
be taken to be $\sin^2\theta_W$. We noted in section 20.9 that, before the discovery of
the W and Z particles, the then known neutrino data were consistent with a
single value of $\theta_W$ given by $\sin^2\theta_W \simeq 0.23$. Using (22.51) and (22.68), it was
then possible to predict the value of $M_W$:

$$ M_W = \left( \frac{\pi\alpha}{\sqrt{2}\,G_F} \right)^{1/2}\frac{1}{\sin\theta_W} \simeq \frac{37.28}{\sin\theta_W}\ {\rm GeV} \simeq 77.73\ {\rm GeV}. \qquad (22.77) $$

Similarly, using (22.42) we predict

$$ M_Z = M_W/\cos\theta_W \simeq 88.58\ {\rm GeV}. \qquad (22.78) $$

These predictions of the theory (at lowest order) indicate the power of the
underlying symmetry to tie together many apparently unrelated quantities,
which are all determined in terms of only a few basic parameters. We now
present a number of other simple tree-level predictions.
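
The tree-level mass predictions follow from $G_F/\sqrt{2} = g^2/8M_W^2$ and $e = g\sin\theta_W$ with $e^2 = 4\pi\alpha$. A minimal numerical sketch is given below; the values of $\alpha$ and $G_F$ are assumed standard inputs rather than numbers taken from this section, and the function names are illustrative.

```python
import math

ALPHA = 1 / 137.036   # assumed fine structure constant
G_F = 1.16638e-5      # assumed Fermi constant in GeV^-2


def w_mass(sin2_theta_w, alpha=ALPHA, g_f=G_F):
    """Tree-level M_W in GeV: sqrt(pi*alpha / (sqrt(2)*G_F)) / sin(theta_W)."""
    prefactor = math.sqrt(math.pi * alpha / (math.sqrt(2) * g_f))
    return prefactor / math.sqrt(sin2_theta_w)


def z_mass(sin2_theta_w, alpha=ALPHA, g_f=G_F):
    """Tree-level M_Z in GeV from M_Z = M_W / cos(theta_W)."""
    return w_mass(sin2_theta_w, alpha, g_f) / math.sqrt(1.0 - sin2_theta_w)


if __name__ == "__main__":
    sin2 = 0.23
    print(f"prefactor = {w_mass(1.0):.2f} GeV")
    print(f"M_W       = {w_mass(sin2):.2f} GeV")
    print(f"M_Z       = {z_mass(sin2):.2f} GeV")
```

These lowest-order values lie a few GeV below the measured masses; the difference is accounted for by radiative corrections, which are outside the scope of this sketch.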
The width for $W^- \to e^- + \bar\nu_e$ can be calculated using the vertex (22.48),
with the result (problem 22.5)

$$ \Gamma(W^- \to e^-\bar\nu_e) = \frac{1}{12}\frac{g^2}{4\pi}M_W = \frac{G_F}{2^{1/2}}\frac{M_W^3}{6\pi} \simeq 205\ {\rm MeV}, \qquad (22.79) $$

using (22.77). The widths to $\mu^-\bar\nu_\mu$, $\tau^-\bar\nu_\tau$ are the same. Neglecting CKM
flavour mixing among the two energetically allowed quark channels $\bar u d$ and $\bar c s$,
their widths would also be the same, apart from a factor of 3 for the different
colour channels. The total W width for all these channels will therefore be
about nine times the value in (22.79), i.e. 1.85 GeV, while the branching ratio
for $W \to e\nu$ is

$$ B(e\nu) = \Gamma(W \to e\nu)/\Gamma({\rm total}) \simeq 11\%. \qquad (22.80) $$

In making these estimates we have neglected all fermion masses.
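
The counting behind the factor of nine can be made explicit. The sketch below evaluates the leptonic width formula and sums the channels; the value of $G_F$ is an assumed input, $M_W$ is the tree-level value from (22.77), and the function names are illustrative.

```python
import math

G_F = 1.16638e-5  # assumed Fermi constant in GeV^-2
M_W = 77.73       # tree-level W mass in GeV, from (22.77)


def w_leptonic_width(m_w=M_W, g_f=G_F):
    """Width for W -> e nu in GeV: G_F * M_W^3 / (6 * sqrt(2) * pi)."""
    return g_f * m_w ** 3 / (6 * math.sqrt(2) * math.pi)


def w_total_width(m_w=M_W, g_f=G_F):
    """Three lepton channels plus two quark channels times three colours."""
    channels = 3 + 2 * 3
    return channels * w_leptonic_width(m_w, g_f)


if __name__ == "__main__":
    partial = w_leptonic_width()
    total = w_total_width()
    print(f"Gamma(W -> e nu) = {1e3 * partial:.1f} MeV")
    print(f"Gamma(total)     = {total:.2f} GeV")
    print(f"B(e nu)          = {100 * partial / total:.1f} %")
```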
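The width for $Z^0 \to \nu\bar\nu$ can be found from (22.79) by replacing $g/2^{1/2}$ by
$g/2\cos\theta_W$, and $M_W$ by $M_Z$, giving

$$ \Gamma(Z^0 \to \nu\bar\nu) = \frac{1}{24}\frac{g^2}{4\pi}\frac{M_Z}{\cos^2\theta_W} = \frac{G_F}{2^{1/2}}\frac{M_Z^3}{12\pi} \simeq 152\ {\rm MeV}, \qquad (22.81) $$

using (22.78). Charged lepton pairs couple with both $c_L^l$ and $c_R^l$ terms, leading
(with neglect of lepton masses) to

$$ \Gamma(Z^0 \to l\bar l) = \left( \frac{|c_L^l|^2 + |c_R^l|^2}{6} \right)\frac{g^2}{4\pi}\frac{M_Z}{\cos^2\theta_W}. \qquad (22.82) $$

The values $c_L^l = \frac{1}{2}$, $c_R^l = 0$ in (22.82) reproduce (22.81). With $\sin^2\theta_W \simeq 0.23$,
we find

$$ \Gamma(Z^0 \to l\bar l) \simeq 76.5\ {\rm MeV}. \qquad (22.83) $$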

FIGURE 22.12
Neutrino-electron graphs involving $Z^0$ exchange.

Quark pairs couple as in (22.71), the GIM mechanism ensuring that all
flavour-changing terms cancel. The total width to $u\bar u$, $d\bar d$, $c\bar c$, $s\bar s$ and $b\bar b$ channels
(allowing 3 for colour and neglecting masses) is then 1538 MeV, producing
an estimated total width of approximately 2.22 GeV. (QCD corrections
will increase these estimates by a factor of order 1.1). The branching ratio
to each charged lepton pair is approximately 3.4%, to the three (invisible) neutrino
channels 20.5%, and to hadrons (via hadronization of the $q\bar q$ channels) about
69.3%. In section 22.4.3 we shall see how a precise measurement of the total
$Z^0$ width at LEP determined the number of light neutrinos to be 3.
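
The partial widths quoted above all follow from one formula: each fermion pair contributes $N_c \cdot 4(|c_L|^2 + |c_R|^2)$ times the neutrino width, with $N_c = 3$ for quarks. The sketch below adds up the channels under the same approximations (massless fermions, no QCD corrections, no top quark channel). The value of $G_F$ is an assumed input and the function names are illustrative.

```python
import math

G_F = 1.16638e-5  # assumed Fermi constant in GeV^-2
M_Z = 88.58       # tree-level Z mass in GeV, from (22.78)
SIN2 = 0.23       # sin^2(theta_W)


def neutrino_width():
    """Width for Z -> nu nubar in GeV: G_F * M_Z^3 / (12 * sqrt(2) * pi)."""
    return G_F * M_Z ** 3 / (12 * math.sqrt(2) * math.pi)


def partial_width(t3, charge, colours=1):
    """Width for Z -> f fbar in GeV, neglecting the fermion mass."""
    c_left = t3 - SIN2 * charge
    c_right = -SIN2 * charge
    return colours * 4 * (c_left ** 2 + c_right ** 2) * neutrino_width()


def z_widths():
    """Return (neutrino, charged lepton, hadronic, total) widths in GeV."""
    nu = partial_width(0.5, 0.0)
    lepton = partial_width(-0.5, -1.0)
    up_type = partial_width(0.5, 2.0 / 3.0, colours=3)
    down_type = partial_width(-0.5, -1.0 / 3.0, colours=3)
    hadrons = 2 * up_type + 3 * down_type  # u, c and d, s, b
    total = 3 * nu + 3 * lepton + hadrons
    return nu, lepton, hadrons, total


if __name__ == "__main__":
    nu, lepton, hadrons, total = z_widths()
    print(f"Gamma(nu nubar) = {1e3 * nu:.1f} MeV")
    print(f"Gamma(l lbar)   = {1e3 * lepton:.1f} MeV")
    print(f"Gamma(hadrons)  = {1e3 * hadrons:.0f} MeV")
    print(f"Gamma(total)    = {total:.2f} GeV")
    print(f"B(invisible)    = {100 * 3 * nu / total:.1f} %")
```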

Cross sections for neutrino-lepton scattering proceeding via $Z^0$ exchange
can be calculated (for $k^2 \ll M_Z^2$) using the currents (22.59) and (22.60), and
the method of section 20.5. Examples are

$$ \nu_\mu e^- \to \nu_\mu e^- \qquad (22.84) $$

and

$$ \bar\nu_\mu e^- \to \bar\nu_\mu e^- \qquad (22.85) $$

as shown in figure 22.12. Since the neutral current for the electron is not pure
V-A, as was the charged current, we expect to see terms involving both $|c_L^l|^2$
and $|c_R^l|^2$, and possibly an interference term. The cross section for (22.84) is
found to be (’t Hooft 1971c)

$$ d\sigma/dy = (2G_F^2 E m_e/\pi)\left[ |c_L^l|^2 + |c_R^l|^2(1 - y)^2 - \tfrac{1}{2}(c_R^{l*}c_L^l + c_L^{l*}c_R^l)\,y\,m_e/E \right] \qquad (22.86) $$

where $E$ is the energy of the incident neutrino in the ‘laboratory’ system, and
$y = (E - E')/E$ as before, where $E'$ is the energy of the outgoing neutrino in
the ‘laboratory’ system$^2$. Equation (22.86) may be compared with the
$\nu_\mu e^- \to \mu^-\nu_e$ (charged current) cross section of (20.84) by noting that
$t = -2m_e E y$: the $|c_L^l|^2$ term agrees with the pure V-A result (20.84), while the
$|c_R^l|^2$ term involves the same $(1 - y)^2$ factor discussed for $\nu\bar q$ scattering in
section 20.7.2. The interference term is negligible for $E \gg m_e$. The cross section
for the antineutrino process (22.85) is found from (22.86) by interchanging $c_L^l$ and
$c_R^l$.

$^2$In the kinematics, lepton masses have been neglected wherever possible.

FIGURE 22.13
One-W annihilation graph in $\bar\nu_e e^- \to \bar\nu_e e^-$.

A third neutrino-lepton process is experimentally available,

$$ \bar\nu_e e^- \to \bar\nu_e e^-, \qquad (22.87) $$

the cross section for which was measured by Reines, Gurr and Sobel (1976),
using electron antineutrinos from an 1800-MW fission reactor at Savannah
River. In this case there is a single W intermediate state graph, shown in
figure 22.13, to consider as well as the $Z^0$ one; the latter is similar to the
right-hand graph in figure 22.12, but with $\bar\nu_\mu$ replaced by $\bar\nu_e$. The cross section for
(22.87) turns out to be given by an expression of the form (22.86), but with
the replacements

$$ c_L^l \to \frac{1}{2} + \sin^2\theta_W, \qquad c_R^l \to \sin^2\theta_W. \qquad (22.88) $$

Reines, Gurr and Sobel reported the result $\sin^2\theta_W = 0.29 \pm 0.05$.

We emphasize once more that all these cross sections are determined in
terms of $G_F$, $\alpha$ and only one further parameter, $\sin^2\theta_W$. As mentioned in
section 20.9, experimental fits to these predictions are reviewed by Commins
and Bucksbaum (1983), Renton (1990) and Winter (2000).

Particularly precise determinations of the Standard Model parameters
were made at the $e^+e^-$ colliders, LEP and SLC. Consider the reaction
$e^+e^- \to f\bar f$ where $f$ is $\mu$ or $\tau$, at energies where the lepton masses may be
neglected in the final answers. In lowest order, the process is mediated by both
$\gamma$-exchange and $Z^0$-exchange as shown in figure 22.14. Calculations of the cross
section were made early on, by Budny (1973) for example. In modern notation, the
differential cross section for the scattering of unpolarized $e^-$ and $e^+$ is given
by

$$ \frac{d\sigma}{d\cos\theta} = \frac{\pi\alpha^2}{2s}\left[ (1 + \cos^2\theta)A + \cos\theta\,B \right] \qquad (22.89) $$

where $\theta$ is the CM scattering angle of the final state lepton, $s = (p_{e^-} + p_{e^+})^2$,
and

$$ A = 1 + 2g_V^e g_V^f\,{\rm Re}\chi(s) + [(g_A^e)^2 + (g_V^e)^2][(g_A^f)^2 + (g_V^f)^2]\,|\chi(s)|^2 \qquad (22.90) $$

$$ B = 4g_A^e g_A^f\,{\rm Re}\chi(s) + 8g_A^e g_V^e g_A^f g_V^f\,|\chi(s)|^2 \qquad (22.91) $$

$$ \chi(s) = s/[4\sin^2\theta_W\cos^2\theta_W(s - M_Z^2 + \mathrm{i}\Gamma_Z M_Z)]. \qquad (22.92) $$

FIGURE 22.14
(a) One-$\gamma$ and (b) one-$Z^0$ annihilation graphs in $e^+e^- \to f\bar f$.

Notice that the term surviving when all the $g$'s are set to zero, which is
therefore the pure single photon contribution, is exactly as calculated in problem
8.18. The presence of the $\cos\theta$ term leads to the forward-backward asymmetry
noted in that problem.

The forward-backward asymmetry $A_{\rm FB}$ may be defined as

$$ A_{\rm FB} \equiv (N_F - N_B)/(N_F + N_B), \qquad (22.93) $$

where $N_F$ is the number scattered into the forward hemisphere $0 \le \cos\theta \le 1$,
and $N_B$ that into the backward hemisphere $-1 \le \cos\theta \le 0$. Integrating
(22.89) one easily finds

$$ A_{\rm FB} = 3B/8A. \qquad (22.94) $$

For $\sin^2\theta_W = 0.25$ we noted after (22.64) that the $g_V^l$'s vanish, so they are
very small for $\sin^2\theta_W \simeq 0.23$. The effect is therefore controlled essentially by
the first term in (22.91). At $\sqrt{s} = 29$ GeV, for example, the asymmetry is
$A_{\rm FB} \simeq -0.063$.
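
Both the integration leading to (22.94) and the numerical value at $\sqrt{s} = 29$ GeV are easy to reproduce. The sketch below uses the tree-level inputs of this section ($\sin^2\theta_W = 0.23$, $M_Z = 88.58$ GeV and a total width of 2.22 GeV); the function names are illustrative, and lepton universality is assumed so that the electron and final-state couplings coincide.

```python
SIN2 = 0.23              # sin^2(theta_W)
M_Z = 88.58              # tree-level Z mass in GeV, from (22.78)
GAMMA_Z = 2.22           # tree-level total Z width estimate in GeV
G_V = -0.5 + 2 * SIN2    # vector coupling of a charged lepton, (22.64)
G_A = -0.5               # axial coupling of a charged lepton, (22.64)


def chi(s):
    """The function chi(s) of (22.92); s in GeV^2."""
    return s / (4 * SIN2 * (1 - SIN2) * complex(s - M_Z ** 2, GAMMA_Z * M_Z))


def coefficients(s):
    """The coefficients A and B of (22.90) and (22.91) for e+ e- -> l+ l-."""
    x = chi(s)
    a = 1 + 2 * G_V * G_V * x.real + (G_A ** 2 + G_V ** 2) ** 2 * abs(x) ** 2
    b = 4 * G_A * G_A * x.real + 8 * (G_A * G_V) ** 2 * abs(x) ** 2
    return a, b


def forward_backward_asymmetry(s):
    """A_FB from the hemisphere integrals of (1 + c^2) A + c B, c = cos(theta)."""
    a, b = coefficients(s)
    forward = 4 * a / 3 + b / 2    # integral over 0 <= c <= 1
    backward = 4 * a / 3 - b / 2   # integral over -1 <= c <= 0
    return (forward - backward) / (forward + backward)


if __name__ == "__main__":
    s = 29.0 ** 2
    a, b = coefficients(s)
    print(f"A_FB from hemispheres = {forward_backward_asymmetry(s):.4f}")
    print(f"3B/8A                 = {3 * b / (8 * a):.4f}")
```

The overall factor $\pi\alpha^2/2s$ cancels in the ratio, which is why it does not appear in the code.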

This asymmetry was observed in experiments with PETRA at DESY and
with PEP at SLAC (see figure 8.20(b)). These measurements, made at
energies well below the $Z^0$ peak, were the first indication of the presence of $Z^0$
exchange in $e^+e^-$ collisions. However, QED alone produces a small positive
$A_{\rm FB}$, through interference between 1$\gamma$ and 2$\gamma$ annihilation processes (which
have different charge conjugation parity), as well as between initial and final
state bremsstrahlung corrections to figure 22.14(a). Indeed, all one-loop
radiative effects must clearly be considered, in any comparison with modern high
precision data.

At the CERN $e^+e^-$ collider LEP, many such measurements were made
‘on the Z peak’, i.e. at $s = M_Z^2$ in the parametrization (22.92). In that case,
${\rm Re}\chi(s) = 0$, and (22.94) becomes (neglecting the photon contribution)

$$ A_{\rm FB}(Z^0\ {\rm peak}) = \frac{3g_A^e g_V^e g_A^f g_V^f}{[(g_A^e)^2 + (g_V^e)^2][(g_A^f)^2 + (g_V^f)^2]}. \qquad (22.95) $$

Another important asymmetry observable is that involving the difference
of the cross sections for left- and right-handed incident electrons:

$$ A_{\rm LR} \equiv (\sigma_L - \sigma_R)/(\sigma_L + \sigma_R), \qquad (22.96) $$

for which the tree-level prediction is

$$ A_{\rm LR} = 2g_V^e g_A^e/[(g_V^e)^2 + (g_A^e)^2]. \qquad (22.97) $$

A similar combination of the $g$'s for the final state leptons can be measured
by forming the ‘L-R F-B’ asymmetry

$$ A_{\rm FB}^{\rm LR} = [(\sigma_{LF} - \sigma_{LB}) - (\sigma_{RF} - \sigma_{RB})]/(\sigma_R + \sigma_L) \qquad (22.98) $$

for which the tree level prediction is

$$ A_{\rm FB}^{\rm LR} = 2g_V^f g_A^f/[(g_V^f)^2 + (g_A^f)^2]. \qquad (22.99) $$

The quantity on the right-hand side of (22.99) is usually denoted by $A_f$:

$$ A_f = 2g_V^f g_A^f/[(g_V^f)^2 + (g_A^f)^2]. \qquad (22.100) $$

The asymmetry $A_{\rm FB}$ is not, in fact, direct evidence for parity violation in
$e^+e^- \to \mu^+\mu^-$, since we see from (22.90) and (22.91) that it is even under
$g_A^l \to -g_A^l$, whereas a true parity-violating effect would involve terms odd
(linear) in $g_A^l$. However, electroweak-induced parity violation effects in an
apparently electromagnetic process were observed in a remarkable experiment
by Prescott et al. (1978). Longitudinally polarized electrons were inelastically
scattered from deuterium, and the flux of scattered electrons was measured
for incident electrons of definite helicity. An asymmetry between the results,
depending on the helicities, was observed, a clear signal for parity violation.
This was the first demonstration of parity-violating effects in an
‘electromagnetic’ process; the corresponding value of $\sin^2\theta_W$ was in agreement with that
determined from $\nu$ data.

We now turn to some of the main experimental evidence, beginning with
the discoveries of the $W^\pm$ and $Z^0$ in 1983.

FIGURE 22.15
Parton model amplitude for $W^\pm$ or $Z^0$ production in $p\bar p$ collisions.


22.4 The discovery of the $W^\pm$ and $Z^0$ at the CERN $p\bar p$ collider


22.4.1 Production cross sections for W and Z in $p\bar p$ colliders


The possibility of producing the predicted $W^\pm$ and $Z^0$ particles was the
principal motivation for transforming the CERN SPS into a $p\bar p$ collider using the
stochastic cooling technique (Rubbia et al. 1977, Staff of the CERN $p\bar p$ project
1981). Estimates of W and $Z^0$ production in $p\bar p$ collisions may be obtained
(see, for example, Quigg 1977) from the parton model, in a way analogous to
that used for the Drell-Yan process in section 9.4 with $\gamma$ replaced by W or
$Z^0$, as shown in figure 22.15 (cf figure 9.11), and for two-jet cross sections in
section 14.3.2. As in (14.51), we denote by $\hat s$ the subprocess invariant

$$ \hat s = (x_1p_1 + x_2p_2)^2 = x_1x_2s \qquad (22.101) $$

for massless partons. With $\hat s^{1/2} = M_W \sim 80$ GeV, and $s^{1/2} = 630$ GeV for
the $p\bar p$ collider energy, we see that the $x$'s are typically $\sim 0.13$, so that the
valence q’s in the proton and $\bar{\rm q}$'s in the antiproton will dominate (at $\sqrt{s} = 1.8$
TeV, appropriate to the Fermilab Tevatron, $x \simeq 0.04$ and the sea quarks
contribute). The parton model cross section $p\bar p \to W^\pm$ + anything is then
(setting $V_{ud} = 1$ and all other $V_{ij} = 0$)


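$$ \sigma(p\bar p \to W^\pm + X) = \frac{1}{3}\int_0^1 dx_1 \int_0^1 dx_2\; \hat\sigma(x_1, x_2) \begin{cases} u(x_1)\bar d(x_2) + \bar d(x_1)u(x_2) & (W^+) \\ \bar u(x_1)d(x_2) + d(x_1)\bar u(x_2) & (W^-) \end{cases} \qquad (22.102) $$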


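where the $\frac{1}{3}$ is the same colour factor as in the Drell-Yan process, and the
subprocess cross section $\hat\sigma$ for $q\bar q \to W^\pm + X$ is (neglecting the $W^\pm$ width)

$$ \hat\sigma = 4\pi^2\alpha\,(1/4\sin^2\theta_W)\,\delta(\hat s - M_W^2) \qquad (22.103) $$

$$ \phantom{\hat\sigma} = \pi 2^{1/2} G_F M_W^2\,\delta(x_1x_2s - M_W^2). \qquad (22.104) $$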
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QCD corrections to (22.102) must as usual be included. Leading loga- 
rithms will make the distributions Q?-dependent, and they should be evalu- 
ated at Q? = M$,. There will be further (O(o2)) corrections, which are often 
accounted for by a multiplicative factor ‘K’, which is of order 1.5-2 at these 
energies. O(a?) calculations are presented in Hamberg et al. (1991) and by 
van der Neerven and Zijlstra (1992); see also Ellis et al. (1996) section 9.4. 
The total cross section for production of W* and W7 at ys =630 GeV is 
then of order 6.5 nb, while a similar calculation for the Z? gives about 2 nb. 
Multiplying these by the branching ratios gives 


$$\sigma(\bar{p}p \to W + X \to e\nu X) \simeq 0.7\ \text{nb} \qquad (22.105)$$

$$\sigma(\bar{p}p \to Z^0 + X \to e^+e^- X) \simeq 0.07\ \text{nb} \qquad (22.106)$$

at $\sqrt{s} = 630$ GeV.


The total cross section for $\bar{p}p$ is about 70 mb at these energies: hence
(22.105) represents ~ $10^{-8}$ of the total cross section, and (22.106) is 10 times
smaller. The rates could, of course, be increased by using the $q\bar{q}$ modes of
W and $Z^0$, which have bigger branching ratios. But the detection of these is
very difficult, being very hard to distinguish from conventional two-jet events 
produced via the mechanism discussed in section 14.3.2, which has a cross 
section some 10? higher than (22.105). W and Z^ would appear as slight 
shoulders on the edge of a very steeply falling invariant mass distribution, 
similar to that shown in figure 9.12, and the calorimetric jet energy resolution 
capable of resolving such an effect is hard to achieve. Thus despite the un- 
favourable branching ratios, the leptonic modes provide the better signatures, 
as discussed further in section 22.4.3. 
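
The numbers quoted in this section can be checked in a few lines. The following is a minimal Python sketch, not part of the original analysis: it assumes the symmetric case $x_1 = x_2$ in (22.101), so that the typical momentum fraction is $x \approx M_W/\sqrt{s}$, and the function name `typical_x` is purely illustrative.

```python
M_W_GEV = 80.0  # approximate W mass used in the text


def typical_x(mass_gev, sqrt_s_gev):
    """Typical parton momentum fraction, assuming x1 = x2 and x1*x2*s = M^2."""
    return mass_gev / sqrt_s_gev


x_spps = typical_x(M_W_GEV, 630.0)       # CERN p-pbar collider
x_tevatron = typical_x(M_W_GEV, 1800.0)  # Fermilab Tevatron

# Fraction of the total p-pbar cross section taken by W -> e nu, cf. (22.105).
NB_PER_MB = 1.0e6
signal_nb = 0.7
total_nb = 70.0 * NB_PER_MB
fraction = signal_nb / total_nb

print(f"x at 630 GeV    ~ {x_spps:.2f}")
print(f"x at 1.8 TeV    ~ {x_tevatron:.2f}")
print(f"signal fraction ~ {fraction:.0e}")
```

The leptonic W signal is thus about one part in $10^8$ of all $\bar{p}p$ interactions, which is why a clean signature matters more than a large branching ratio.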


22.4.2 Charge asymmetry in $W^\pm$ decay


At energies such that the simple valence quark picture of (22.102) is valid, the
$W^+$ is created in the annihilation of a left-handed u quark from the proton
and a right-handed $\bar{d}$ quark from the $\bar{p}$ (neglecting fermion masses). In the
$W^+ \to e^+\nu_e$ decay, a right-handed $e^+$ and left-handed $\nu_e$ are emitted. Referring
to figure 22.16, we see that angular momentum conservation allows $e^+$
production parallel to the direction of the antiproton, but forbids it parallel
to the direction of the proton. Similarly, in $W^- \to e^-\bar\nu_e$, the $e^-$ is emitted
preferentially parallel to the proton (these considerations are exactly similar
to those mentioned in section 20.7.2 with reference to $\nu q$ and $\bar\nu q$ scattering).
The actual distribution has the form ~ $(1 + \cos\theta_e^*)^2$, where $\theta_e^*$ is the angle, in
the rest frame of the W, between the $e^-$ and the p (for $W^- \to e^-\bar\nu$) or the
$e^+$ and the $\bar{p}$ (for $W^+ \to e^+\nu$).

FIGURE 22.16
Preferred direction of leptons in $W^\pm$ decay.
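
The size of the asymmetry implied by a $(1 + \cos\theta_e^*)^2$ distribution can be obtained by exact integration. The sketch below is an illustration only: it works at the parton level of (22.102), ignoring sea quarks, detector acceptance and backgrounds, and the helper name is ours.

```python
from fractions import Fraction


def integral_one_plus_c_squared(lo, hi):
    """Exact integral of (1 + c)^2 dc from lo to hi, via the antiderivative (1 + c)^3 / 3."""
    def antiderivative(c):
        return (1 + Fraction(c)) ** 3 / 3
    return antiderivative(hi) - antiderivative(lo)


total = integral_one_plus_c_squared(-1, 1)
forward = integral_one_plus_c_squared(0, 1)    # cos(theta*) > 0
backward = integral_one_plus_c_squared(-1, 0)  # cos(theta*) < 0

forward_fraction = forward / total
asymmetry = (forward - backward) / total

print("forward fraction:", forward_fraction)
print("forward-backward asymmetry:", asymmetry)
```

Under these idealized assumptions seven leptons in eight emerge in the favoured hemisphere, so the effect is far from subtle.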


22.4.3 Discovery of the $W^\pm$ and $Z^0$ at the $\bar{p}p$ collider, and
their properties


As already indicated in section 22.4.1, the best signatures for W and Z
production in $\bar{p}p$ collisions are provided by the leptonic modes

$$\bar{p}p \to W^\pm X \to e^\pm\nu X \qquad (22.107)$$
$$\bar{p}p \to Z^0 X \to e^+e^- X. \qquad (22.108)$$

Reaction (22.107) has the larger cross section, by a factor of 10 (cf (22.105) and
(22.106)), and was observed first (UA1, Arnison et al. 1983a; UA2, Banner
et al. 1983). However, the kinematics of (22.108) is simpler and so the $Z^0$
discovery (UA1, Arnison et al. 1983b; UA2, Bagnaia et al. 1983) will be
discussed first.

The signature for (22.108) is an isolated, and approximately back-to-back,
$e^+e^-$ pair with invariant mass peaked around 90 GeV (cf (22.78)). Very clean
events can be isolated by imposing a modest transverse energy cut: the $e^+e^-$
pairs required are coming from the decay of a massive, relatively slowly moving
$Z^0$. Figure 22.17 shows the transverse energy distribution of a candidate $Z^0$
event from the first UA2 sample. Figure 22.18 shows (Geer 1986) the invariant
mass distribution for a later sample of 14 UA1 events in which both electrons
have well measured energies, together with the Breit-Wigner resonance curve
appropriate to $M_Z = 93$ GeV/$c^2$, with experimental mass resolution folded
in. The UA1 result for the $Z^0$ mass was

$$M_Z = 93.0 \pm 1.4\,(\text{stat.}) \pm 3.2\,(\text{syst.})\ \text{GeV}. \qquad (22.109)$$

The corresponding UA2 result (DiLella 1986), based on 13 well measured
pairs, was

$$M_Z = 92.5 \pm 1.3\,(\text{stat.}) \pm 1.5\,(\text{syst.})\ \text{GeV}. \qquad (22.110)$$
In both cases the systematic error reflects the uncertainty in the absolute
calibration of the calorimeter energy scale. Clearly the agreement with (22.78)
is good, but there is a suggestion that the tree-level prediction is on the low
side. Indeed, loop corrections adjust (22.78) to a value $M_Z \simeq 91.19$ GeV,
in excellent agreement with the current experimental value (Nakamura et al.
2010).

FIGURE 22.17
The cell transverse energy distribution for a $Z^0 \to e^+e^-$ event (UA2, Bagnaia
et al. 1983) in the $\theta$ and $\phi$ plane, where $\theta$ and $\phi$ are the polar and azimuth
angles relative to the beam axis.

FIGURE 22.18
Invariant mass distribution for 14 well measured $Z^0 \to e^+e^-$ decays (UA1).
Figure reprinted with permission from S Geer in High Energy Physics 1985,
Proc. Yale Theoretical Advanced Study Institute, eds M J Bowick and F
Gursey; copyright 1986 World Scientific Publishing Company.

The total $Z^0$ width $\Gamma_Z$ is an interesting quantity. If we assume that, for
any fermion family additional to the three known ones, only the neutrinos are
significantly less massive than $M_Z/2$, we have

$$\Gamma_Z \simeq (2.5 + 0.16\,\Delta N_\nu)\ \text{GeV} \qquad (22.111)$$

from section 22.3, where $\Delta N_\nu$ is the number of additional light neutrinos
(i.e. beyond $\nu_e$, $\nu_\mu$ and $\nu_\tau$) which contribute to the width through the process
$Z^0 \to \nu\bar\nu$. Thus (22.111) can be used as an important measure of the number
of such neutrinos (i.e. generations) if $\Gamma_Z$ can be determined accurately enough.
The mass resolution of the $\bar{p}p$ experiments was of the same order as the total
expected $Z^0$ width, so that (22.111) could not be used directly. The advent
of LEP provided precision checks on (22.111); at the cost of departing from
the historical development, we show data from DELPHI (Abreu et al. 1990,
Abe 1991) in figure 22.19, which established $N_\nu = 3$.
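
To see what accuracy in $\Gamma_Z$ is needed for neutrino counting, (22.111) can be evaluated and inverted directly. This is only a sketch using the rounded coefficients of (22.111); the function names are illustrative, and a real determination uses the full lineshape fit rather than this linear formula.

```python
def z_width_gev(extra_neutrinos):
    """Total Z width from (22.111) for a given number of additional light neutrinos."""
    return 2.5 + 0.16 * extra_neutrinos


def extra_neutrinos_from_width(width_gev):
    """Invert (22.111): number of additional light neutrinos implied by a width."""
    return (width_gev - 2.5) / 0.16


for n in range(3):
    width = z_width_gev(n)
    shift_percent = 100.0 * (width - z_width_gev(0)) / z_width_gev(0)
    print(f"Delta N = {n}: width = {width:.2f} GeV ({shift_percent:.1f}% above three families)")
```

Each extra light neutrino changes the width by only about 6%, so a resolution of a few GeV, as at the $\bar{p}p$ collider, cannot separate the hypotheses.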

We turn now to the $W^\pm$. In this case an invariant mass plot is impossible,
since we are looking for the $e\nu$ ($\mu\nu$) mode, and cannot measure the $\nu$'s.
However, it is clear that, as in the case of $Z^0 \to e^+e^-$ decay, slow moving
massive W's will emit isolated electrons with high transverse energy. Further,
such electrons should be produced in association with large missing transverse
energy (corresponding to the $\nu$'s), which can be measured by calorimetry, and
which should balance the transverse energy of the electrons. Thus electrons of
high $E_T$ accompanied by balancing high missing $E_T$ (i.e. similar in magnitude
to that of the $e^-$ but opposite in azimuth) were the signatures used for the
early event samples (UA1, Arnison et al. 1983a; UA2, Banner et al. 1983).

FIGURE 22.19
The cross-section for $e^+e^- \to$ hadrons around the $Z^0$ mass (DELPHI, 1990).
The dotted, continuous and dashed lines are the predictions of the Standard
Model assuming two, three and four massless neutrino species respectively.
Figure reprinted with permission from K Abe in Proc. 25th Int. Conf. on
High Energy Physics eds K K Phua and Y Yamaguchi; copyright 1991 World
Scientific Publishing Company.

FIGURE 22.20
Kinematics of $W \to e\nu$ decay.

FIGURE 22.21
$W \to e\nu$ transverse mass distribution measured by the CDF collaboration.
Figure reprinted with permission from F Abe et al. (CDF Collaboration)
Phys. Rev. D 52 4784 (1995). Copyright 1995 by the American Physical
Society.

The determination of the mass of the W is not quite so straightforward as
that of the Z, since we cannot construct directly an invariant mass plot for the
$e\nu$ pair: only the missing transverse momentum (or energy) can be attributed
to the $\nu$, since some unidentified longitudinal momentum will always be lost
down the beam pipe. In fact, the distribution of events in $p_{eT}$, the magnitude
of the transverse momentum of the $e^-$, should show a pronounced peaking
towards the maximum kinematically allowed value, which is $p_{eT} \approx \frac{1}{2}M_W$, as
may be seen from the following argument. Consider the decay of a W at rest
(figure 22.20). We have $|\mathbf{p}_e| = \frac{1}{2}M_W$ and $|\mathbf{p}_{eT}| = \frac{1}{2}M_W\sin\theta \equiv p_{eT}$. Thus the
transverse momentum distribution is given by

$$\frac{d\sigma}{dp_{eT}} = \frac{d\sigma}{d\cos\theta}\left|\frac{d\cos\theta}{dp_{eT}}\right| = \frac{d\sigma}{d\cos\theta}\left(\frac{2p_{eT}}{M_W}\right)\left(\tfrac{1}{4}M_W^2 - p_{eT}^2\right)^{-1/2} \qquad (22.112)$$

and the last (Jacobian) factor in (22.112) produces a strong peaking towards
$p_{eT} = \frac{1}{2}M_W$. This peaking will be smeared by the width, and transverse
motion, of the W. Early determinations of $M_W$ used (22.112), but sensitivity
to the transverse momentum of the W can be much reduced (Barger et al.
1983) by considering instead the distribution in ‘transverse mass’, defined by

$$M_T^2 = (E_{eT} + E_{\nu T})^2 - (\mathbf{p}_{eT} + \mathbf{p}_{\nu T})^2 \simeq 2p_{eT}\,p_{\nu T}(1 - \cos\phi), \qquad (22.113)$$


where $\phi$ is the azimuthal separation between $\mathbf{p}_{eT}$ and $\mathbf{p}_{\nu T}$. Here $E_{\nu T}$ and $\mathbf{p}_{\nu T}$
are the neutrino transverse energy and momentum, measured from the missing
transverse energy and momentum obtained from the global event reconstruction.
This inclusion of additional measured quantities improves the precision
as compared with the Jacobian peak method, using (22.112). A Monte Carlo
simulation was used to generate $M_T$ distributions for different values of $M_W$,
and the most probable value was found by a maximum likelihood fit. The
quoted results were

UA1 (Geer 1986): $M_W = 83.5\,^{+1.1}_{-1.0}\,(\text{stat.}) \pm 2.8\,(\text{syst.})$ GeV (22.114)
UA2 (DiLella 1986): $M_W = 81.2 \pm 1.1\,(\text{stat.}) \pm 1.3\,(\text{syst.})$ GeV (22.115)

the systematic errors again reflecting uncertainty in the absolute energy scale
of the calorimeters. The two experiments also quoted (Geer 1986, DiLella
1986)

UA1: $\Gamma_W < 6.5$ GeV
UA2: $\Gamma_W < 7.0$ GeV (both at 90% c.l.) (22.116)
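
The Jacobian peak of (22.112) and the transverse mass of (22.113) are easy to illustrate numerically. The following toy Monte Carlo is a sketch under stated assumptions, not the experiments' simulation: the W decays at rest with zero width, the decay angle follows the $(1 + \cos\theta)^2$ form of section 22.4.2, the value of `M_W` is illustrative, and all function names are ours.

```python
import math
import random

M_W = 80.4  # GeV, illustrative value


def sample_cos_theta(rng):
    """Draw cos(theta) from a density proportional to (1 + cos(theta))^2 on [-1, 1]."""
    # Inverse-CDF sampling: the cumulative distribution is (1 + c)^3 / 8.
    return 2.0 * rng.random() ** (1.0 / 3.0) - 1.0


def electron_pt(cos_theta, m_w=M_W):
    """Electron transverse momentum for a W decaying at rest."""
    return 0.5 * m_w * math.sqrt(1.0 - cos_theta ** 2)


def transverse_mass(pt_e, pt_nu, dphi):
    """Transverse mass of (22.113) for massless leptons."""
    return math.sqrt(2.0 * pt_e * pt_nu * (1.0 - math.cos(dphi)))


rng = random.Random(1)
pts = [electron_pt(sample_cos_theta(rng)) for _ in range(20000)]

# Histogram of p_T in 8 equal bins between 0 and M_W / 2.
n_bins = 8
counts = [0] * n_bins
for pt in pts:
    index = min(int(pt / (0.5 * M_W) * n_bins), n_bins - 1)
    counts[index] += 1

for i, count in enumerate(counts):
    upper_edge = (i + 1) * 0.5 * M_W / n_bins
    print(f"p_T below {upper_edge:5.1f} GeV: {count}")
```

The counts pile up in the last bin, just below $\frac{1}{2}M_W$. For a W at rest the leptons are back to back in the transverse plane, so `transverse_mass(pt, pt, math.pi)` is $2p_{eT}$ and the $M_T$ distribution ends at $M_W$.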


FIGURE 22.22
The W decay angular distribution of the emission angle $\theta_e^*$ of the positron
(electron) with respect to the antiproton (proton) beam direction, in the rest
frame of the W, for a total of 75 events; background subtracted and acceptance
corrected (Arnison et al. 1986).

Once again, the agreement between the experiments, and of both with (22.77),
is good, the predictions again being on the low side. Loop corrections adjust
(22.77) to $M_W \simeq 80.38$ GeV (Nakamura et al. 2010). We show in figure 22.21
a later determination of $M_W$ by the CDF collaboration (Abe et al. 1995a).

The W and Z mass values may be used together with (22.42) to obtain
$\sin^2\theta_W$ via

$$\sin^2\theta_W = 1 - M_W^2/M_Z^2. \qquad (22.117)$$


The weighted average of UA1 and UA2 yielded

$$\sin^2\theta_W = 0.212 \pm 0.022\ (\text{stat.}). \qquad (22.118)$$

Radiative corrections have in general to be applied, but one renormalization
scheme (see section 22.6) promotes (22.117) to a definition of the renormalized
$\sin^2\theta_W$ to all orders in perturbation theory. Using this scheme and quoted
values of $M_W$, $M_Z$ (Nakamura et al. 2010) one finds $\sin^2\theta_W \simeq 0.223$.
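
The value 0.223 follows from (22.117) by simple arithmetic. The sketch below uses the Nakamura et al. (2010) masses quoted in this chapter, $M_W = 80.399 \pm 0.023$ GeV and $M_Z = 91.1876 \pm 0.0021$ GeV; the error estimate assumes uncorrelated uncertainties, which is our simplification.

```python
import math


def sin2_theta_w(m_w, m_z):
    """sin^2(theta_W) as defined by (22.117)."""
    return 1.0 - (m_w / m_z) ** 2


def sin2_theta_w_error(m_w, dm_w, m_z, dm_z):
    """First-order error propagation, assuming uncorrelated mass uncertainties."""
    ratio_sq = (m_w / m_z) ** 2
    return 2.0 * ratio_sq * math.hypot(dm_w / m_w, dm_z / m_z)


value = sin2_theta_w(80.399, 91.1876)
error = sin2_theta_w_error(80.399, 0.023, 91.1876, 0.0021)
print(f"sin^2(theta_W) = {value:.4f} +/- {error:.4f}")
```

The uncertainty is dominated by that on $M_W$, since $M_Z$ is known about ten times more precisely.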

Finally, figure 22.22 shows (Arnison et al. 1986) the angular distribution of
the charged lepton in $W \to e\nu$ decay (see section 22.4.2); $\theta_e^*$ is the $e^+$ ($e^-$) angle
in the W rest frame, measured with respect to the antiproton (proton) beam
direction. The expected form $(1 + \cos\theta_e^*)^2$ is followed very closely.
In summary, we may say that the early discovery experiments provided
remarkably convincing confirmation of the principal expectations of the GSW
theory, as outlined in section 22.3.
We now consider some further aspects of the theory.


22.5 Fermion masses 
22.5.1 One generation 


The fact that the SU(2)$_L$ gauge group acts only on the L components of the
fermion fields immediately appears to create a fundamental problem as far
as the masses of these particles are concerned; we mentioned this briefly at
the end of section 19.6. Let us recall first that the standard way to introduce
the interactions of gauge fields with matter fields (e.g. fermions) is via the
covariant derivative replacement

$$\partial^\mu \to D^\mu = \partial^\mu + ig\boldsymbol{\tau}\cdot\hat{\boldsymbol{W}}^\mu/2 \qquad (22.119)$$

for SU(2) fields $\hat{\boldsymbol{W}}^\mu$ acting on $t = 1/2$ doublets. Now it is a simple exercise
(compare problem 18.3) to check that the ordinary ‘kinetic’ part of a free
Dirac fermion does not mix the L and R components of the field:

$$\bar{\hat\psi}\,\slashed{\partial}\,\hat\psi = \bar{\hat\psi}_R\,\slashed{\partial}\,\hat\psi_R + \bar{\hat\psi}_L\,\slashed{\partial}\,\hat\psi_L. \qquad (22.120)$$

Thus we can in principle contemplate ‘gauging’ the L and the R components
differently. Of course, in the case of QCD (cf (18.39)) the replacement $\slashed{\partial} \to \slashed{D}$
was made equally in each term on the right-hand side of (22.120). But this was
because QCD conserves parity, and must therefore treat L and R components
the same. Weak interactions are parity violating, and the SU(2)$_L$ covariant
derivative acts only in the second term of (22.120). On the other hand, a
Dirac mass term has the form

$$-m(\bar{\hat\psi}_L\hat\psi_R + \bar{\hat\psi}_R\hat\psi_L) \qquad (22.121)$$

(see equation (18.41) for example), and it precisely couples the L and R
components. It is easy to see that if only $\hat\psi_L$ is subject to a transformation, then
(22.121) is not invariant. Thus mass terms for Dirac fermions will explicitly
break SU(2)$_L$. The same is also true for Majorana fermions which might
describe the neutrinos.

This kind of explicit breaking of the gauge symmetry cannot be tolerated,
in the sense that it will lead, once again, to violations of unitarity, and then of
renormalizability. Consider, for example, a fermion-antifermion annihilation
process of the form

$$f\bar{f} \to W_0^+ W_0^-, \qquad (22.122)$$

where the subscript indicates the $\lambda = 0$ (longitudinal) polarization state of the
$W^\pm$. We studied such a reaction in section 22.1.1 in the context of unitarity
violations (in lowest-order perturbation theory) for the IVB model. Appelquist
and Chanowitz (1987) considered first the case in which ‘f’ is a lepton with
$t = \frac{1}{2}$, $t_3 = -\frac{1}{2}$ coupling to W's, $Z^0$ and $\gamma$ with the usual SU(2)$_L$ × U(1)
couplings, but having an explicit (Dirac) mass $m_f$. They found that in the
‘right’ helicity channels for the leptons ($\lambda = +1$ for $\bar{f}$, $\lambda = -1$ for f) the
bad high energy behaviour associated with a fermion-exchange diagram of
the form of figure 22.4 was cancelled by that of the diagrams shown in figure
22.23. The sum of the amplitudes tends to a constant as s (or $E^2$) $\to \infty$.
Such cancellations are a feature of gauge theories, as we indicated at the end
of section 22.1.2, and represent one aspect of the renormalizability of the
theory. But suppose, following Appelquist and Chanowitz (1987), we examine
channels involving the ‘wrong’ helicity component, for example $\lambda = +1$ for
the fermion f. Then it is found that the cancellation no longer occurs, and we
shall ultimately have a ‘non-renormalizable’ problem on our hands, all over
again.

FIGURE 22.23
One-$Z^0$ and one-$\gamma$ annihilation contribution to $f_{\lambda=-1}\bar{f}_{\lambda=1} \to W_0^+ W_0^-$.

An estimate of the energy at which this will happen can be made by
recalling that the ‘wrong’ helicity state participates only by virtue of a factor
($m_f$/energy) (recall section 20.2.2), which here we can take to be $m_f/\sqrt{s}$.
The typical bad high energy behaviour for an amplitude $\mathcal{M}$ was $\mathcal{M} \sim G_F s$,
which we expect to be modified here to

$$\mathcal{M} \sim G_F s\, m_f/\sqrt{s} = G_F m_f\sqrt{s}. \qquad (22.123)$$

The estimate obtained by Appelquist and Chanowitz differs only by a factor of
$\sqrt{2}$. Attending to all the factors in the partial wave expansion gives the result
that the unitarity bound will be saturated at $E = E_f$ (TeV) ~ $\pi/m_f$ (TeV).
Thus for $m_t \sim 175$ GeV, $E_t \sim 18$ TeV. This would constitute a serious flaw
in the theory, even though the breakdown occurs at energies beyond those
currently reachable.

However, in a theory with spontaneous symmetry breaking, there is a way
of giving fermion masses without introducing an explicit mass term in the
Lagrangian. Consider the electron, for example, and let us hypothesize a
‘Yukawa’-type coupling between the electron-type SU(2) doublet

$$\hat{l}_{eL} = \begin{pmatrix} \hat\nu_e \\ \hat{e}^- \end{pmatrix}_L, \qquad (22.124)$$

the Higgs doublet $\hat\phi$, and the R-component of the electron field:

$$\hat{\mathcal{L}}^e_{\rm Yuk} = -g_e(\bar{\hat{l}}_{eL}\hat\phi\,\hat{e}_R + \bar{\hat{e}}_R\hat\phi^\dagger\hat{l}_{eL}). \qquad (22.125)$$

In each term of (22.125), the two SU(2)$_L$ doublets are ‘dotted together’ so as
to form an SU(2)$_L$ scalar, which multiplies the SU(2)$_L$ scalar R-component.
Thus (22.125) is SU(2)$_L$-invariant, and the symmetry is preserved, at the
Lagrangian level, by such a term. But now insert just the vacuum value
(22.28) of $\hat\phi$ into (22.125): we find the result

$$\hat{\mathcal{L}}^e_{\rm Yuk}(\text{vac}) = -g_e\frac{v}{\sqrt{2}}(\bar{\hat{e}}_L\hat{e}_R + \bar{\hat{e}}_R\hat{e}_L) \qquad (22.126)$$

which is exactly a (Dirac) mass of the form (22.121), allowing us to make the
identification

$$m_e = g_e v/\sqrt{2}. \qquad (22.127)$$


When oscillations about the vacuum value are considered via the replacement
(22.29), the term (22.125) will generate a coupling between the electron
and the Higgs fields of the form

$$\begin{aligned} -g_e\bar{\hat{e}}\hat{e}\hat{H}/\sqrt{2} &= -(m_e/v)\,\bar{\hat{e}}\hat{e}\hat{H} &\qquad (22.128) \\ &= -(g m_e/2M_W)\,\bar{\hat{e}}\hat{e}\hat{H}. &\qquad (22.129) \end{aligned}$$

The presence of such a coupling, if present for the process $f\bar{f} \to W_0^+ W_0^-$
considered earlier, will mean that, in addition to the f-exchange graph analogous
to figure 22.4 and the annihilation graphs of figure 22.23, a further graph,
shown in figure 22.24, must be included. The presence of the fermion mass in
the coupling to H suggests that this graph might be just what is required to
cancel the ‘bad’ high energy behaviour found in (22.123), and by this time
the reader will not be surprised to be told that this is indeed the case.

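FIGURE 22.24
One-H annihilation graph.

At first sight it might seem that this stratagem will only work for the
$t_3 = -\frac{1}{2}$ components of doublets, because of the form of $\langle 0|\hat\phi|0\rangle$. But we
learned in section 12.1.3 that if a pair of states $\begin{pmatrix} u \\ d \end{pmatrix}$ forming an SU(2)
doublet transform by

$$\begin{pmatrix} u' \\ d' \end{pmatrix} = e^{-i\boldsymbol{\alpha}\cdot\boldsymbol{\tau}/2}\begin{pmatrix} u \\ d \end{pmatrix} \qquad (22.130)$$

then the charge conjugate states $i\tau_2\begin{pmatrix} u^* \\ d^* \end{pmatrix}$ transform in exactly the same
way. Thus if, in our case, $\hat\phi$ is the SU(2) doublet

$$\hat\phi = \begin{pmatrix} \frac{1}{\sqrt{2}}(\hat\phi_1 - i\hat\phi_2) \\ \frac{1}{\sqrt{2}}(\hat\phi_3 - i\hat\phi_4) \end{pmatrix} \equiv \begin{pmatrix} \hat\phi^+ \\ \hat\phi^0 \end{pmatrix} \qquad (22.131)$$

then the charge conjugate field

$$\hat\phi_C \equiv i\tau_2\hat\phi^{\dagger T} = \begin{pmatrix} \frac{1}{\sqrt{2}}(\hat\phi_3 + i\hat\phi_4) \\ -\frac{1}{\sqrt{2}}(\hat\phi_1 + i\hat\phi_2) \end{pmatrix} \equiv \begin{pmatrix} \bar{\hat\phi}^0 \\ -\hat\phi^- \end{pmatrix} \qquad (22.132)$$

is also an SU(2) doublet, transforming in just the same way as $\hat\phi$. ((22.131)
and (22.132) may be thought of as analogous to the ($K^+$, $K^0$) and ($\bar{K}^0$, $K^-$)
isospin doublets in SU(3)$_f$). Note that the vacuum value (22.28) will now
appear in the upper component of (22.132). With the help of $\hat\phi_C$ we can write
down another SU(2)-invariant coupling in the $\nu_e - e$ sector, namely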


$$-g_{\nu_e}(\bar{\hat{l}}_{eL}\hat\phi_C\hat\nu_{eR} + \bar{\hat\nu}_{eR}\hat\phi_C^\dagger\hat{l}_{eL}), \qquad (22.133)$$

assuming now the existence of the field $\hat\nu_{eR}$. In the Higgs vacuum (22.28),
(22.133) then yields

$$-(g_{\nu_e}v/\sqrt{2})(\bar{\hat\nu}_{eL}\hat\nu_{eR} + \bar{\hat\nu}_{eR}\hat\nu_{eL}) \qquad (22.134)$$

which is precisely a (Dirac) mass for the neutrino, if we set $g_{\nu_e}v/\sqrt{2} = m_{\nu_e}$.

It is clearly possible to go on like this, and arrange for all the fermions,
quarks as well as leptons, to acquire a mass by the same ‘mechanism’. We will
look more closely at the quarks in the next section. But one must admit to
a certain uneasiness concerning the enormous difference in magnitudes
represented by the couplings $g_{\nu_e}, \ldots, g_e, \ldots, g_t$. If $m_{\nu_e} < 1$ eV then $g_{\nu_e} < 10^{-11}$, while
$g_t \sim 1$! Besides, whereas the use of the Higgs field ‘mechanism’ in the W-Z
sector is quite economical, in the present case it seems rather unsatisfactory
simply to postulate a different ‘g’ for each fermion-Higgs interaction. This
does appear to indicate that we are dealing here with a ‘phenomenological
model’, once more, rather than a ‘theory’.
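
The spread in the Yukawa couplings can be made concrete by inverting (22.127) for each fermion. In the sketch below the value $v \approx 246$ GeV for the Higgs vacuum value and the electron mass 0.511 MeV are standard inputs supplied by us; the neutrino and top masses are the illustrative ones used in the text, and the function name is hypothetical.

```python
import math

V_GEV = 246.0  # Higgs vacuum value v in GeV (assumed approximate value)


def yukawa_coupling(mass_gev, v_gev=V_GEV):
    """Invert m_f = g_f v / sqrt(2), cf. (22.127)."""
    return math.sqrt(2.0) * mass_gev / v_gev


couplings = {
    "neutrino (1 eV)": yukawa_coupling(1.0e-9),
    "electron": yukawa_coupling(0.511e-3),
    "top (175 GeV)": yukawa_coupling(175.0),
}

for name, g in couplings.items():
    print(f"{name:>16s}: g = {g:.2e}")

hierarchy = couplings["top (175 GeV)"] / couplings["neutrino (1 eV)"]
print(f"ratio g_t / g_nu = {hierarchy:.1e}")
```

The couplings span more than eleven orders of magnitude, which is the source of the uneasiness expressed above.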

As far as the neutrinos are concerned, however, there is another possibility,
already discussed in sections 7.5.2, 20.3 and 21.4.1, which is that they could
be Majorana (not Dirac) fermions. In this case, rather than the four degrees
of freedom ($\nu_{eL}$, $\nu_{eR}$, and their antiparticles) which exist for massive Dirac
particles, only two possibilities exist for neutrinos, which we may take to be
$\nu_{eL}$ and $\nu_{eR}$. With these, it is certainly possible to construct a Dirac-type
mass term of the form (22.134). But since, after all, the $\nu_{eR}$ component has
been assigned zero quantum numbers both for SU(2)$_L$ W-interactions and for
U(1) B-interactions (see table 22.1), we could consider economically dropping
it altogether, making do with just the $\nu_{eL}$ component.

Suppose, then, that we keep only the field $\hat\nu_{eL}$. We need to form a mass
term for it. The charge-conjugate field is defined by (see (7.151))

$$(\hat\nu_{eL})_C = i\gamma^2\hat\nu_{eL}^{\dagger T}, \qquad (22.135)$$

and we know that the charge-conjugate field transforms under Lorentz
transformations in the same way as the original field. So we can use $(\hat\nu_{eL})_C$ to form
a Lorentz invariant

$$\overline{(\hat\nu_{eL})_C}\,\hat\nu_{eL} \qquad (22.136)$$

which has mass dimension $M^3$. Hence we may write a mass term for $\hat\nu_{eL}$ in
the form

$$-\tfrac{1}{2}m_L\,[\overline{(\hat\nu_{eL})_C}\,\hat\nu_{eL} + \bar{\hat\nu}_{eL}(\hat\nu_{eL})_C] \qquad (22.137)$$

where the $\frac{1}{2}$ is conventional. Written out in more detail, we have

$$\overline{(\hat\nu_{eL})_C}\,\hat\nu_{eL} = \hat\nu_{eL}^T(-i\gamma^{2\dagger}\gamma^0)\hat\nu_{eL} = \hat\nu_{eL}^T\,i\gamma^2\gamma^0\,\hat\nu_{eL}, \qquad (22.138)$$

in the representation (20.14). Now

$$i\gamma^2\gamma^0 = \begin{pmatrix} -i\sigma_2 & 0 \\ 0 & i\sigma_2 \end{pmatrix}. \qquad (22.139)$$

But since $\hat\nu_{eL}$ is an L-chiral field, only its 2 lower components are present (cf
(20.26)) and (22.138) is effectively

$$\overline{(\hat\nu_{eL})_C}\,\hat\nu_{eL} = \hat\nu_{eL}^T(i\sigma_2)\hat\nu_{eL}. \qquad (22.140)$$

This is just the form of the mass term for a Majorana field, as we saw in
equation (7.159). The two formalisms are equivalent.
As noted in section 21.4.1, the mass term (22.137) is not invariant under
a global U(1) phase transformation

$$\hat\nu_{eL} \to e^{-i\alpha}\hat\nu_{eL}, \qquad (22.141)$$

which would correspond to lepton number (if accompanied by a similar
transformation for the electron fields): the Majorana mass term violates lepton
number conservation.

There is a further interesting aspect to (22.140) which is that, since two
$\hat\nu_{eL}$ operators appear rather than a $\hat\nu_e$ and a $\hat\nu_e^\dagger$ (which would lead to $L_e$
conservation), the ($t$, $t_3$) quantum numbers of the term are (1,1). This means
that we cannot form an SU(2) invariant with it, using only the Standard
Model Higgs $\hat\phi$, since the latter has $t = \frac{1}{2}$ and cannot combine with the (1,1)
operator to form a singlet. Thus we cannot make a ‘tree-level’ Majorana
mass by the mechanism of Yukawa coupling to the Higgs field, followed by
symmetry breaking.

However, we could generate suitable ‘effective’ operators via loop correc- 
tions, perhaps, much as we generated an effective operator representing an 
anomalous magnetic moment interaction in QED (cf section 11.7). But what- 
ever it is, the operator would have to violate lepton number conservation, 
which is actually conserved by all the Standard Model interactions. Thus 
such an effective operator could not be generated in perturbation theory. It 
could arise, however, as a low energy limit of a theory defined at a higher 
mass scale, as the current-current model is the low energy limit of the GSW 
one. The typical form of the operator we need, in order to generate a term
$\hat\nu_{eL}^T\,i\sigma_2\,\hat\nu_{eL}$, is

= a (inpo) Tio» (Abbot). (22.142) 
Note, most importantly, that the operator ‘$(l\phi)(\phi l)$’ in (22.142) has mass
dimension five, which is why we introduced the factor $M^{-1}$ in the coupling;
it is indeed a non-renormalizable effective interaction, just like the
current-current one. We may interpret M as the mass scale at which ‘new physics’
enters, in the spirit of the discussion in section 11.7. Suppose, for the sake of
argument, this was $M \sim 10^{16}$ GeV (a scale typical of Grand Unified Theories).
After symmetry breaking, then, (22.142) will generate the required Majorana
mass term, with

$$m_M \sim g_{eM}\frac{v^2}{M} \sim g_{eM}\,10^{-2}\ \text{eV}. \qquad (22.143)$$

Thus an effective coupling of ‘natural’ size $g_{eM} \sim 0.1$ emerges from this
argument, if indeed the mass of the $\nu_e$ is of order $10^{-3}$ eV.
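
The order of magnitude in (22.143) can be verified directly. This sketch assumes $v \approx 246$ GeV (a standard value supplied by us) and keeps no numerical factors of order one; the function name is illustrative.

```python
V_GEV = 246.0       # Higgs vacuum value v in GeV (assumed approximate value)
EV_PER_GEV = 1.0e9


def majorana_mass_ev(coupling, scale_gev, v_gev=V_GEV):
    """Order-of-magnitude estimate m ~ g v^2 / M of (22.143), returned in eV."""
    return coupling * v_gev ** 2 / scale_gev * EV_PER_GEV


mass_unit_coupling = majorana_mass_ev(1.0, 1.0e16)
mass_natural_coupling = majorana_mass_ev(0.1, 1.0e16)
print(f"g = 1.0: m ~ {mass_unit_coupling:.1e} eV")
print(f"g = 0.1: m ~ {mass_natural_coupling:.1e} eV")
```

With $M \sim 10^{16}$ GeV the suppression $v^2/M$ alone brings the mass down to the $10^{-2}$ eV range, without requiring an unnaturally small coupling.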

A more specific model can be constructed in which a relation of the form
(22.143) can arise naturally. Suppose $\hat\nu_R$ is an R-type neutrino field which is
an SU(2) × U(1) singlet, and which has a gauge-invariant Yukawa coupling
to the Higgs field, of the form (22.133). Then the Yukawa and the mass terms
for $\hat\nu_R$ are

$$\hat{\mathcal{L}}_{\nu_R} = -g_R(\bar{\hat{l}}_{eL}\hat\phi_C\hat\nu_R + \bar{\hat\nu}_R\hat\phi_C^\dagger\hat{l}_{eL}) - \tfrac{1}{2}m_R\,[\overline{(\hat\nu_R)_C}\,\hat\nu_R + \text{h.c.}]. \qquad (22.144)$$

Then, in the Higgs vacuum the first term in (22.144) becomes

$$-m_D(\bar{\hat\nu}_{eL}\hat\nu_R + \bar{\hat\nu}_R\hat\nu_{eL}) \qquad (22.145)$$

where $m_D = g_R v/\sqrt{2}$. The term (22.145) couples the fields $\hat\nu_R$ and $\hat\nu_{eL}$, so
that we need to do a diagonalization to find the true mass eigenvalues and
eigenstates. The combined mass terms from (22.144) and (22.145) can be
written as


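$$-\tfrac{1}{2}\,\overline{(\hat{N}_L)_C}\,\mathbf{M}\,\hat{N}_L + \text{h.c.} \qquad (22.146)$$

where

$$\hat{N}_L = \begin{pmatrix} \hat\nu_{eL} \\ (\hat\nu_R)_C \end{pmatrix}, \qquad \mathbf{M} = \begin{pmatrix} 0 & m_D \\ m_D & m_R \end{pmatrix}. \qquad (22.147)$$

CP invariance would imply that the parameters $m_R$ and $m_D$ are real, as we
will assume, for simplicity.

Suppose now that $m_D \ll m_R$. Then the eigenvalues of M are approximately

$$m_1 \approx m_R, \qquad m_2 \approx -m_D^2/m_R. \qquad (22.148)$$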


The apparently troubling minus sign can be absorbed into the mixing
parameters. Thus one eigenvalue is (by assumption) very large compared to $m_D$,
and one is very much smaller. The vanishing of the first element in M ensures
that the lepton number violating term (22.137) is characterized by a large
mass scale $m_R$. It may be natural to assume that $m_D$ is a ‘typical’ quark or
lepton mass term, which would then imply that $m_2$ of (22.148) is very much
lighter than that, as appears to be true for the neutrinos. This is the famous
‘see-saw’ mechanism of Minkowski (1977), Gell-Mann et al. (1979), Yanagida
(1979) and Mohapatra and Senjanovic (1980, 1981). If in fact $m_R \sim 10^{16}$ GeV,
we recover an estimate for $m_2$ which is similar to that in (22.143). It is worth
emphasizing that the Majorana nature of the massive neutrinos is an essential
part of the see-saw mechanism.
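
The approximate eigenvalues (22.148) can be compared with the exact ones for the matrix with rows $(0, m_D)$ and $(m_D, m_R)$. The sketch below takes real, positive $m_D$ and $m_R$; the choice $m_D = 100$ GeV in the second example is our illustrative stand-in for a ‘typical’ quark or lepton mass.

```python
import math


def seesaw_eigenvalues(m_d, m_r):
    """Exact eigenvalues of [[0, m_d], [m_d, m_r]] for real m_d, m_r > 0."""
    root = math.sqrt(m_r ** 2 + 4.0 * m_d ** 2)
    heavy = 0.5 * (m_r + root)
    # The product of the eigenvalues is det M = -m_d^2; using it avoids the
    # cancellation in 0.5 * (m_r - root) when m_d is much smaller than m_r.
    light = -m_d ** 2 / heavy
    return heavy, light


heavy, light = seesaw_eigenvalues(1.0, 100.0)
print(f"exact:  m1 = {heavy:.6f}, m2 = {light:.8f}")
print(f"approx: m1 = {100.0:.6f}, m2 = {-1.0 / 100.0:.8f}")

# Illustrative physical scales: m_D = 100 GeV (our assumption), m_R = 1e16 GeV.
_, light_gev = seesaw_eigenvalues(100.0, 1.0e16)
light_ev = abs(light_gev) * 1.0e9
print(f"|m2| for m_D = 100 GeV, m_R = 1e16 GeV: {light_ev:.1e} eV")
```

Raising $m_R$ lowers $|m_2|$ in inverse proportion, which is the origin of the name ‘see-saw’.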

These considerations are tending to take us ‘beyond the Standard Model’, 
so we shall not pursue them at any greater length. Instead, we must now 
generalize the discussion of fermion masses to the three-generation case. 


22.5.2 Three-generation mixing 


We introduce three doublets of left-handed quark fields

$$\hat{q}_{L1} = \begin{pmatrix} \hat{u}_{L1} \\ \hat{d}_{L1} \end{pmatrix}, \quad \hat{q}_{L2} = \begin{pmatrix} \hat{u}_{L2} \\ \hat{d}_{L2} \end{pmatrix}, \quad \hat{q}_{L3} = \begin{pmatrix} \hat{u}_{L3} \\ \hat{d}_{L3} \end{pmatrix} \qquad (22.149)$$

and the corresponding six singlets

$$\hat{u}_{R1},\ \hat{d}_{R1},\ \hat{u}_{R2},\ \hat{d}_{R2},\ \hat{u}_{R3},\ \hat{d}_{R3}, \qquad (22.150)$$

which transform in the now familiar way under SU(2)$_L$ × U(1). The $\hat{u}$-fields
correspond to the $t_3 = +\frac{1}{2}$ components of SU(2)$_L$, the $\hat{d}$ ones to the $t_3 = -\frac{1}{2}$
components, and to their ‘R’ partners. The labels 1, 2 and 3 refer to the
family number; for example, with no mixing at all, $\hat{u}_{L1} = \hat{u}_L$, $\hat{d}_{L1} = \hat{d}_L$,
etc. We have to consider what is the most general SU(2)$_L$ × U(1)-invariant
interaction between the Higgs field (assuming we can still get by with only one)
and these various fields. Apart from the symmetry, the only other theoretical
requirement is renormalizability; for, after all, if we drop this we might as well
abandon the whole motivation for the ‘gauge’ concept. This implies (as in the
discussion of the Higgs potential V) that we cannot have terms like $(\bar{\hat\psi}\hat\psi\hat\phi)^2$
appearing, which would have a coupling with dimensions (mass)$^{-4}$ and would
be non-renormalizable. In fact the only renormalizable Yukawa coupling is of
the form ‘$\bar{\hat\psi}\hat\psi\hat\phi$’, which has a dimensionless coupling (as in the $g_e$ and $g_{\nu_e}$ of
(22.125) and (22.133)). However, there is no a priori requirement for it to be
‘diagonal’ in the weak interaction family index i. The allowed generalization
of (22.125) and (22.133) is therefore an interaction of the form (summing on
repeated indices)


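$$\hat{\mathcal{L}}_{\psi\phi} = a_{ij}\,\bar{\hat{q}}_{Li}\hat\phi_C\hat{u}_{Rj} + b_{ij}\,\bar{\hat{q}}_{Li}\hat\phi\,\hat{d}_{Rj} + \text{h.c.} \qquad (22.151)$$

where

$$\hat{q}_{Li} = \begin{pmatrix} \hat{u}_{Li} \\ \hat{d}_{Li} \end{pmatrix}, \qquad (22.152)$$

and a sum on the family indices i and j (from 1 to 3) in (22.151) is assumed.
After symmetry breaking, using the gauge (22.29), we find (problem 22.6)

$$\hat{\mathcal{L}}_{qH} = -\left(1 + \frac{\hat{H}}{v}\right)[\bar{\hat{u}}_{Li}m^u_{ij}\hat{u}_{Rj} + \bar{\hat{d}}_{Li}m^d_{ij}\hat{d}_{Rj} + \text{h.c.}], \qquad (22.153)$$

where the ‘mass matrices’ are

$$m^u_{ij} = -\frac{v}{\sqrt{2}}a_{ij}, \qquad m^d_{ij} = -\frac{v}{\sqrt{2}}b_{ij}. \qquad (22.154)$$

Although we have not indicated it, the $m^u$ and $m^d$ matrices could involve
a ‘$\gamma_5$’ part as well as a ‘1’ part in Dirac space. It can be shown (Weinberg
1973, Feinberg et al. 1959) that $m^u$ and $m^d$ can both be made Hermitean,
$\gamma_5$-free, and diagonal by making four separate unitary transformations on the
‘generation triplets’

$$\hat{u}_L = \begin{pmatrix} \hat{u}_{L1} \\ \hat{u}_{L2} \\ \hat{u}_{L3} \end{pmatrix}, \qquad \hat{d}_L = \begin{pmatrix} \hat{d}_{L1} \\ \hat{d}_{L2} \\ \hat{d}_{L3} \end{pmatrix}, \ \text{etc.} \qquad (22.155)$$

via

$$\hat{u}_{L\alpha} = (U_L^{(u)})_{\alpha i}\hat{u}_{Li}, \qquad \hat{u}_{R\alpha} = (U_R^{(u)})_{\alpha i}\hat{u}_{Ri} \qquad (22.156)$$
$$\hat{d}_{L\alpha} = (U_L^{(d)})_{\alpha i}\hat{d}_{Li}, \qquad \hat{d}_{R\alpha} = (U_R^{(d)})_{\alpha i}\hat{d}_{Ri}. \qquad (22.157)$$

In this notation, ‘α’ is the index of the ‘mass diagonal’ basis, and ‘i’ is that
of the ‘weak interaction’ basis.³ Then (22.153) becomes

$$\hat{\mathcal{L}}_{qH} = -\left(1 + \frac{\hat{H}}{v}\right)[m_u\bar{\hat{u}}\hat{u} + \ldots + m_b\bar{\hat{b}}\hat{b}]. \qquad (22.158)$$

Rather remarkably, we can still manage with only the one Higgs field. It
couples to each fermion with a strength proportional to the mass of that
fermion, divided by $M_W$.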

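Now consider the SU(2)$_L$ × U(1) gauge invariant interaction part of the
Lagrangian. Written out in terms of the ‘weak interaction’ fields $\hat{u}_{L,Ri}$ and
$\hat{d}_{L,Ri}$ (cf (22.43) and (22.44)), it is

$$\begin{aligned} \hat{\mathcal{L}}_{f,W,B} = {} & i(\bar{\hat{u}}_{Lj}, \bar{\hat{d}}_{Lj})\gamma^\mu(\partial_\mu + ig\boldsymbol{\tau}\cdot\hat{\boldsymbol{W}}_\mu/2 + ig'y\hat{B}_\mu/2)\begin{pmatrix} \hat{u}_{Lj} \\ \hat{d}_{Lj} \end{pmatrix} \\ & + i\bar{\hat{u}}_{Rj}\gamma^\mu(\partial_\mu + ig'y\hat{B}_\mu/2)\hat{u}_{Rj} + i\bar{\hat{d}}_{Rj}\gamma^\mu(\partial_\mu + ig'y\hat{B}_\mu/2)\hat{d}_{Rj} \end{aligned} \qquad (22.159)$$

where a sum on j is understood. This now has to be rewritten in terms of the
mass-eigenstate fields $\hat{u}_{L,R\alpha}$ and $\hat{d}_{L,R\alpha}$.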

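Problem 22.7 shows that the neutral current part of (22.159) is diagonal in
the mass basis, provided the U matrices of (22.156) and (22.157) are unitary;
that is, the neutral current interactions do not change the flavour of the
physical (mass eigenstate) quarks. The charged current processes, however, involve
the non-diagonal matrices $\tau_1$ and $\tau_2$ in (22.159), and this spoils the argument
used in problem 22.7. Indeed, using (22.47) we find that the charged current
piece is

$$\begin{aligned} \hat{\mathcal{L}}_{CC} &= -\frac{g}{\sqrt{2}}(\bar{\hat{u}}_{Lj}, \bar{\hat{d}}_{Lj})\gamma^\mu\tau_+\hat{W}_\mu\begin{pmatrix} \hat{u}_{Lj} \\ \hat{d}_{Lj} \end{pmatrix} + \text{h.c.} \\ &= -\frac{g}{\sqrt{2}}\bar{\hat{u}}_{Lj}\gamma^\mu\hat{d}_{Lj}\hat{W}_\mu + \text{h.c.} \\ &= -\frac{g}{\sqrt{2}}\bar{\hat{u}}_{L\alpha}[(U_L^{(u)})_{\alpha j}(U_L^{(d)\dagger})_{j\beta}]\gamma^\mu\hat{d}_{L\beta}\hat{W}_\mu + \text{h.c.}, \end{aligned} \qquad (22.160)$$

where the matrix

$$V_{\alpha\beta} \equiv [U_L^{(u)}U_L^{(d)\dagger}]_{\alpha\beta} \qquad (22.161)$$

is not diagonal, though it is unitary. This is the CKM matrix (Cabibbo
1963, Kobayashi and Maskawa 1973), originally introduced by Kobayashi
and Maskawa in the context of their three-generation extension of the
then-developing Standard Model, in order to provide room for CP violation within
the SU(2) × U(1) gauge theory framework. The interaction (22.160) then has
the form

$$-\frac{g}{\sqrt{2}}\hat{W}_\mu[\bar{\hat{u}}_L\gamma^\mu\hat{d}'_L + \bar{\hat{c}}_L\gamma^\mu\hat{s}'_L + \bar{\hat{t}}_L\gamma^\mu\hat{b}'_L] + \text{h.c.}, \qquad (22.162)$$

where

$$\begin{pmatrix} \hat{d}'_L \\ \hat{s}'_L \\ \hat{b}'_L \end{pmatrix} = \begin{pmatrix} V_{ud} & V_{us} & V_{ub} \\ V_{cd} & V_{cs} & V_{cb} \\ V_{td} & V_{ts} & V_{tb} \end{pmatrix}\begin{pmatrix} \hat{d}_L \\ \hat{s}_L \\ \hat{b}_L \end{pmatrix}, \qquad (22.163)$$

with the phenomenology described in the previous chapter.

³So, for example, $\hat{u}_{L\alpha=t} \equiv \hat{t}_L$, $\hat{d}_{L\alpha=s} \equiv \hat{s}_L$, etc.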
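
The statement that $V$ in (22.161) is unitary but not diagonal can be checked numerically for any pair of unitary matrices. The sketch below builds two arbitrary 3 × 3 unitary matrices from complex plane rotations; the angles and phases are invented for illustration and have nothing to do with the measured CKM parameters, and all helper names are ours.

```python
import cmath
import math


def matmul(a, b):
    """Product of two square matrices stored as nested lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)] for i in range(n)]


def dagger(a):
    """Hermitian conjugate of a square matrix."""
    n = len(a)
    return [[a[j][i].conjugate() for j in range(n)] for i in range(n)]


def rotation(i, j, angle, phase=0.0):
    """3x3 unitary matrix: a rotation by `angle` in the (i, j) plane with a complex phase."""
    u = [[complex(int(r == c)) for c in range(3)] for r in range(3)]
    c, s = math.cos(angle), math.sin(angle)
    u[i][i] = complex(c)
    u[j][j] = complex(c)
    u[i][j] = s * cmath.exp(-1j * phase)
    u[j][i] = -s * cmath.exp(1j * phase)
    return u


def max_deviation_from_identity(m):
    n = len(m)
    return max(abs(m[i][j] - (1.0 if i == j else 0.0)) for i in range(n) for j in range(n))


# Stand-ins for U_L^(u) and U_L^(d); the numbers are arbitrary.
u_left_up = matmul(rotation(0, 1, 0.3), rotation(1, 2, 0.7, phase=0.4))
u_left_down = matmul(rotation(0, 2, 0.5, phase=1.1), rotation(0, 1, 0.2))

v = matmul(u_left_up, dagger(u_left_down))  # cf. (22.161)

unitarity_error = max_deviation_from_identity(matmul(v, dagger(v)))
off_diagonal_size = max(abs(v[i][j]) for i in range(3) for j in range(3) if i != j)
print(f"max |V V^dagger - 1| = {unitarity_error:.1e}")
print(f"largest off-diagonal |V_ij| = {off_diagonal_size:.3f}")
```

Setting `u_left_down` equal to `u_left_up` makes $V$ the unit matrix, which shows that only the mismatch between the two left-handed rotations is observable in the charged current.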

An analysis similar to the above can be carried out in the leptonic sector. 
We would then have leptonic flavour mixing in charged current processes, 
via the leptonic analogue of the CKM matrix, namely the PMNS matrix 
(Pontecorvo 1957, 1958, 1967; Maki, Nakagawa and Sakata 1962); this is 
the matrix whose elements are probed in neutrino oscillations, as we saw in 
chapter 21. 


The $Z^0$ mass

$$M_Z = 91.1876 \pm 0.0021\ \text{GeV} \qquad (22.164)$$

has been determined from the Z-lineshape scan at LEP1 (Schael et al. 2006).
The W mass is (Nakamura et al. 2010)

$$M_W = 80.399 \pm 0.023\ \text{GeV}. \qquad (22.165)$$

The asymmetry parameter $A_e$ (see (22.100)) is (Abe et al. 2000)

$$A_e = 0.15138 \pm 0.00216 \qquad (22.166)$$

from measurements at SLD. These are just three examples from the table
of 36 observables listed in the review of the electroweak model by Erler and
Langacker in Nakamura et al. (2010). Such remarkable precision is a triumph
of machine design and experimental art, and it is the reason why we
need a renormalizable electroweak theory. The overall fit to the data, including
higher-order corrections, is generally very good, as quoted by Erler and
Langacker with $\chi^2$/d.o.f. = 43.0/44. One of the few discrepancies is a 2.7σ
deviation in the Z-pole forward-backward asymmetry $A_{FB}^{(0,b)}$ from LEP1; another
is a 2.5σ deviation in the muon anomalous magnetic moment, $g_\mu - 2$. This
strong numerical consistency lends impressive support to the belief that we
are indeed dealing with a renormalizable spontaneously broken gauge theory:
renormalizable, because no extra parameters, not in the original Lagrangian,
have had to be introduced; a gauge theory, because the fermion and gauge
boson couplings obey the relations imposed by the local SU(2) × U(1)
symmetry; and spontaneously broken because the same symmetry is not seen in
the particle spectrum (consider the mass separation in the t-b doublet, for
instance).

In fact, one can turn this around, in more than one way. First, one crucially
important element in the theory, the Higgs boson, has a mass $m_H$ which
is largely unconstrained by theory (see section 22.8.2), and it is therefore a
parameter in the fits. Some information about $m_H$ can therefore be gained by
seeing how the fits vary with $m_H$. Actually, we shall see in equation (22.181)
that the dependence on $m_H$ is only logarithmic (it acts rather like a
cut-off), so the fits are not very sensitive to $m_H$. The 90% central
confidence range from all precision data is given by Erler and Langacker as
$55\ \text{GeV} < m_H < 135\ \text{GeV}$. By contrast, some loop corrections
are proportional to the square of the top mass (see (22.180)) and consequently
very tight bounds could be placed on $m_t$ via its virtual presence (i.e. in
loops, for example as shown in figure 22.25) before its real presence was
confirmed, as we shall discuss shortly and in section 22.7. Secondly, it is
still entirely possible that very careful analysis of small discrepancies
between precision data and electroweak predictions may indicate the presence
of ‘new physics’.

After all this (and earlier) emphasis on the renormalizability of the elec- 
troweak theory, and the introduction to one-loop calculations in QED at the 
end of volume 1, the reader perhaps has a right to expect, now, an exposition 
of loop corrections in the electroweak theory. But the fact is that this is a very 
complicated and technical story, requiring quite a bit more formal machinery, 
which would be outside the intended scope of this book (suitable references in- 
clude Altarelli et al. 1989, especially the pedagogical account by Consoli et al. 
1989; and the equally approachable lectures by Hollik 1991). Instead, we want 
to touch on just a few of the simpler and more important aspects of one-loop 
corrections, especially insofar as they have phenomenological implications. 

As we have seen, we obtain cut-off independent results from loop correc- 
tions in a renormalizable theory by taking the values of certain parameters — 
those appearing in the original Lagrangian — from experiment, according to 
a well-defined procedure (‘renormalization scheme’). In the electroweak case, 
the parameters in the Lagrangian are 


gauge couplings $g$, $g'$   (22.167)

Higgs potential parameters $\lambda$, $\mu^2$   (22.168)

Higgs-fermion Yukawa couplings $g_f$   (22.169)

CKM angles $\theta_{12}$, $\theta_{13}$, $\theta_{23}$; phase $\delta$   (22.170)

PMNS angles $\theta^{\nu}_{12}$, $\theta^{\nu}_{13}$, $\theta^{\nu}_{23}$; phase $\delta^{\nu}$ (+ $\alpha_{21}$, $\alpha_{31}$).   (22.171)

The fermion masses and mixings, and the Higgs mass, can be separated off,
leaving $g$, $g'$ and one combination of $\lambda$ and $\mu^2$ (for instance,
the tree-level vacuum value $v$). These three parameters are usually replaced
by the equivalent and more convenient set

$\alpha$ (Bouchendira et al. 2011);   (22.172)

$G_F$ (Marciano and Sirlin 1988, van Ritbergen and Stuart 1999),   (22.173)

(see also Nir 1989, Pak and Czarnecki 2008, Chitwood et al. 2007, and Barczyk
et al. 2008); and

$M_Z$ (Schael et al. 2006).   (22.174)

These are, of course, related to $g$, $g'$ and $v$; for example, at tree-level

$$\alpha = \frac{g^2 g'^2}{(g^2 + g'^2)\,4\pi}, \qquad M_Z = \frac{1}{2}\,v\sqrt{g^2 + g'^2}, \qquad G_F = \frac{1}{\sqrt{2}\,v^2} \tag{22.175}$$

but these relations become modified in higher order. The renormalized
parameters will ‘run’ in the way described in chapters 15 and 16; the running
of $\alpha$, for example, has been observed directly, as noted in section 11.5.
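As a rough numerical illustration of the tree-level relations (22.175), the following Python sketch inverts them to obtain $v$, $g$ and $g'$ from $\alpha$, $G_F$ and $M_Z$, and then evaluates the tree-level $M_W = gv/2$. The values used for $\alpha$ and $G_F$ are standard reference numbers assumed here (they are not quoted in the text), and the assignment of the larger root to $g^2$ assumes $g > g'$.

```python
import math

# M_Z is the value quoted in (22.164); ALPHA and G_F are standard reference
# values assumed here for illustration only.
ALPHA = 1.0 / 137.035999      # fine-structure constant at zero momentum (assumed)
G_F = 1.1663787e-5            # Fermi constant in GeV^-2 (assumed)
M_Z = 91.1876                 # Z mass in GeV, from (22.164)


def tree_level_parameters(alpha, g_fermi, m_z):
    """Invert the tree-level relations (22.175) for v, g and g'."""
    v = 1.0 / math.sqrt(math.sqrt(2.0) * g_fermi)   # G_F = 1/(sqrt(2) v^2)
    s = 4.0 * m_z**2 / v**2                         # s = g^2 + g'^2
    p = 4.0 * math.pi * alpha * s                   # p = g^2 g'^2
    root = math.sqrt(s * s - 4.0 * p)
    g2 = 0.5 * (s + root)                           # larger root taken as g^2 (assumes g > g')
    gp2 = 0.5 * (s - root)                          # smaller root is g'^2
    return v, math.sqrt(g2), math.sqrt(gp2)


v, g, g_prime = tree_level_parameters(ALPHA, G_F, M_Z)
sin2_theta = g_prime**2 / (g**2 + g_prime**2)
m_w_tree = 0.5 * g * v                              # tree-level M_W = g v / 2

print(f"v = {v:.2f} GeV, g = {g:.4f}, g' = {g_prime:.4f}")
print(f"tree-level sin^2(theta_W) = {sin2_theta:.4f}, M_W = {m_w_tree:.2f} GeV")
```

With these inputs the tree-level $M_W$ comes out at roughly 80.9 GeV, about half a GeV above the measured value (22.165); closing that gap is the job of the higher-order corrections.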

After renormalization one can derive radiatively corrected values for physical
quantities in terms of the set (22.172)-(22.174) (together with $m_H$ and the
fermion masses and mixings). But a renormalization scheme has to be specified,
at any finite order (though in practice the differences are very small). One
conceptually simple scheme is the ‘on-shell’ one (Sirlin 1980, 1984; Kennedy
et al. 1989; Kennedy and Lynn 1989; Bardin et al. 1989; Hollik 1990; for
reviews see Langacker 1995). In this scheme, the tree-level formula

$$\sin^2\theta_W = 1 - M_W^2/M_Z^2 \tag{22.176}$$

is promoted into a definition of the renormalized $\sin^2\theta_W$ to all
orders in perturbation theory, it being then denoted by $s_W^2$:

$$s_W^2 = 1 - M_W^2/M_Z^2 \simeq 0.223. \tag{22.177}$$

The radiatively corrected value for $M_W$ is then

$$M_W^2 = \frac{\pi\alpha/(\sqrt{2}\,G_F)}{s_W^2\,(1 - \Delta r)} \tag{22.178}$$


where $\Delta r$ includes the radiative corrections relating $\alpha$,
$\alpha(M_Z)$, $G_F$, $M_W$ and $M_Z$. Another scheme is the modified minimal
subtraction ($\overline{\text{MS}}$) scheme (appendix O) which introduces the
quantity $\sin^2\hat\theta_W(\mu) = \hat g'^2(\mu)/[\hat g^2(\mu) + \hat g'^2(\mu)]$
where the couplings $\hat g$ and $\hat g'$ are defined in the
$\overline{\text{MS}}$ scheme and $\mu$ is chosen to be $M_Z$ for most
electroweak processes. Attention is then focused on
$\hat s_Z^2 = \sin^2\hat\theta_W(M_Z)$. This is the scheme used by Erler and
Langacker in Nakamura et al. (2010).

We shall continue here with the scheme defined by (22.177). We cannot go
into detail about all the contributions to $\Delta r$, but we do want to
highlight two features of the result which are surprising, important
phenomenologically, and related to an interesting symmetry. It turns out
(Consoli et al. 1989, Hollik 1991) that the leading terms in $\Delta r$ have
the form

$$\Delta r = \Delta r_0 - \frac{(1 - s_W^2)}{s_W^2}\,\Delta\rho + (\Delta r)_{\text{rem}}. \tag{22.179}$$

In (22.179), $\Delta r_0 = 1 - \alpha/\alpha(M_Z)$ is due to the running of
$\alpha$, and has the value $\Delta r_0 = 0.0664(2)$ (see section 11.5.3).
$\Delta\rho$ is given by (Veltman 1977)

$$\Delta\rho = \frac{3G_F(m_t^2 - m_b^2)}{8\pi^2\sqrt{2}}, \tag{22.180}$$

while the ‘remainder’ $(\Delta r)_{\text{rem}}$ contains a significant term
proportional to $\ln(m_t/m_Z)$, and a contribution from the Higgs boson which
is (for $m_H \gg M_W$)

$$(\Delta r)_{\text{rem},H} \approx \frac{\sqrt{2}\,G_F M_W^2}{16\pi^2}\left\{\frac{11}{3}\left[\ln\left(\frac{m_H^2}{M_W^2}\right) - \frac{5}{6}\right]\right\}. \tag{22.181}$$

As the notation suggests, $\Delta\rho$ is a leading contribution to the
parameter $\rho$ introduced in (22.66). As explained there, it measures the
strength of neutral current processes relative to charged current ones.
$\Delta\rho$ is then a radiative correction to $\rho$. It turns out that, to
good approximation, electroweak radiative corrections in
$e^+e^- \to Z^0 \to f\bar f$ can be included by replacing the fermionic
couplings $g_V^f$ and $g_A^f$ (see (22.64), (22.75) and (22.76)) by

$$\bar g_V^f = \sqrt{\rho_f}\,(t_3^f - 2Q_f\kappa_f s_W^2) \tag{22.182}$$

and

$$\bar g_A^f = \sqrt{\rho_f}\,t_3^f, \tag{22.183}$$

together with corrections to the $Z^0$-propagator. The corrections have the
form (in the on-shell scheme) $\rho_f \approx 1 + \Delta\rho$ (of equation
(22.180)) and $\kappa_f \approx 1 + \frac{(1 - s_W^2)}{s_W^2}\Delta\rho$, for
$f \neq b, t$. For the b-quark there is an additional contribution coming
from the presence of the virtual top quark in vertex corrections to
$Z \to b\bar b$ (Akhundov et al. 1986, Beenakker and Hollik 1988).

The running of $\alpha$ in $\Delta r_0$ is expected, but (22.180) and (22.181)
contain surprising features. As regards (22.180), it is associated with
top-bottom quark loops in vacuum polarization amplitudes, of the kind
discussed for the photon in section 11.5, but this time in weak boson
propagators. In the QED case, referring to equation (11.39) for example, we
saw that the contribution of heavy fermions (‘$|q^2| \ll m_f^2$’) was
suppressed, appearing as $O(|q^2|/m_f^2)$. In such a situation (which is the
usual one) the heavy particles are said to ‘decouple’. But the correction
(22.180) is quite different, the fermion masses being in the numerator.
Clearly, with a large value of $m_t$, this can make a relatively big
difference. This is why some precision measurements are surprisingly
sensitive to the value of $m_t$ in the range near (as we now know) the
physical value. Secondly, as regards the dependence on $m_H$, we might well
have expected it to involve $m_H^2$ in the numerator if we considered the
typical divergence of a scalar particle in a loop (we shall return to this
after discussing (22.180)). $\Delta r$ would then have been very sensitive to
$m_H$, but in fact the sensitivity is only logarithmic.

FIGURE 22.25
t-b vacuum polarization contribution.
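The contrast between quadratic sensitivity to $m_t$ and logarithmic sensitivity to $m_H$ can be made concrete with a short Python sketch. It assumes the forms of (22.180) and (22.181) with $M_W$ in the Higgs prefactor and logarithm, neglects the b-quark mass, and uses an assumed standard value for $G_F$.

```python
import math

G_F = 1.1663787e-5   # GeV^-2, assumed standard value
M_W = 80.399         # GeV, (22.165)


def delta_r_higgs(m_h, m_w=M_W, g_fermi=G_F):
    """Higgs part of the remainder, (22.181), intended for m_h >> M_W."""
    prefactor = math.sqrt(2.0) * g_fermi * m_w**2 / (16.0 * math.pi**2)
    return prefactor * (11.0 / 3.0) * (math.log(m_h**2 / m_w**2) - 5.0 / 6.0)


def delta_rho_top(m_t, g_fermi=G_F):
    """Top contribution to (22.180), neglecting the b-quark mass."""
    return 3.0 * g_fermi * m_t**2 / (8.0 * math.pi**2 * math.sqrt(2.0))


for m_h in (300.0, 1000.0):
    print(f"m_H = {m_h:6.0f} GeV: (Delta r)_rem,H = {delta_r_higgs(m_h):.4f}")
for m_t in (150.0, 173.1, 200.0):
    print(f"m_t = {m_t:6.1f} GeV: Delta rho = {delta_rho_top(m_t):.4f}")
```

More than tripling $m_H$ from 300 GeV to 1 TeV changes the Higgs term by only about 0.006, whereas doubling $m_t$ multiplies $\Delta\rho$ by four.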

We can understand the appearance of the fermion masses (squared) in the
numerator of (22.180) as follows. The shift $\Delta\rho$ is associated with
vector boson vacuum polarization contributions, for example the one shown in
figure 22.25. Consider in particular the contribution from the longitudinal
polarization components of the W's. As we have seen, these components are
nothing but three of the four Higgs components which the $W^\pm$ and $Z^0$
‘swallowed’ to become massive. But the couplings of these ‘swallowed’ Higgs
fields to fermions are determined by just the same Higgs-fermion Yukawa
couplings as we introduced to generate the fermion masses via spontaneous
symmetry breaking. Hence we expect the fermion loops to contribute (to these
longitudinal W states) something of order $g_f^2/4\pi$ where $g_f$ is the
Yukawa coupling. Since $g_f \sim m_f/v$ (see (22.127)) we arrive at an
estimate $\sim m_f^2/4\pi v^2 \sim G_F m_f^2/4\pi$ as in (22.180). An
important message is that particles which acquire their mass spontaneously do
not ‘decouple’.

But we now have to explain why $\Delta\rho$ in (22.180) would vanish if
$m_t^2 = m_b^2$, and why only $\ln m_H^2$ appears in (22.181). Both these
facts are related to a symmetry of the assumed minimal Higgs sector which we
have not yet discussed. Let us first consider the situation at tree level,
where $\rho = 1$. It may be shown (Ross and Veltman 1975) that $\rho = 1$ is a
natural consequence of having the symmetry broken by an SU(2)$_L$ doublet
Higgs field (rather than a triplet, say), or indeed by any number of
doublets. The nearness of the measured $\rho$ parameter to 1 is, in fact,
good support for the hypothesis that there are only doublet Higgs fields.
Problem 22.8 explores a simple model with a Higgs field in the triplet
representation.

At tree level, it is simplest to think of $\rho$ in connection with the mass
ratio (22.66). To see the significance of this, let us go back to the
Higgs-gauge field Lagrangian $\hat{\mathcal{L}}_{G\Phi}$ of (22.30) which
produced the gauge boson masses. With the doublet Higgs of the form (22.131),
it is a striking fact that the Higgs potential only involves the highly
symmetrical combination of fields

$$\hat\phi_1^2 + \hat\phi_2^2 + \hat\phi_3^2 + \hat\phi_4^2, \tag{22.184}$$

as does the vacuum condition (17.102). This suggests that there may be
some extra symmetry in (22.30) which is special to the doublet structure.
But of course, to be of any interest, this symmetry has to be present in the
$(\hat D_\mu\hat\phi)^\dagger(\hat D^\mu\hat\phi)$ term as well.

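The nature of this symmetry is best brought out by introducing a change
of notation for the Higgs doublet $\hat\phi^+$ and $\hat\phi^0$: instead of
(22.131), we now write (cf (18.70))

$$\hat\phi = \begin{pmatrix}(\hat\pi_2 + i\hat\pi_1)/\sqrt{2}\\ (\hat\sigma - i\hat\pi_3)/\sqrt{2}\end{pmatrix} \tag{22.185}$$

while the $\hat\phi_C$ field of (22.132) becomes

$$\hat\phi_C = \begin{pmatrix}(\hat\sigma + i\hat\pi_3)/\sqrt{2}\\ (i\hat\pi_1 - \hat\pi_2)/\sqrt{2}\end{pmatrix}. \tag{22.186}$$

We then find that these can be written as

$$\hat\phi = \frac{1}{\sqrt{2}}(\hat\sigma + i\boldsymbol{\tau}\cdot\hat{\boldsymbol{\pi}})\begin{pmatrix}0\\ 1\end{pmatrix}, \qquad \hat\phi_C = \frac{1}{\sqrt{2}}(\hat\sigma + i\boldsymbol{\tau}\cdot\hat{\boldsymbol{\pi}})\begin{pmatrix}1\\ 0\end{pmatrix}. \tag{22.187}$$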
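The equivalence of the component forms (22.185), (22.186) and the matrix forms (22.187) is easy to check numerically. The sketch below treats $\hat\sigma$ and $\hat{\boldsymbol\pi}$ as ordinary real numbers at a single point, and it also assumes the standard definition $\hat\phi_C = i\tau_2\hat\phi^*$ for the charge-conjugate doublet (an assumption here, since (22.132) is not reproduced in this section).

```python
import math
import random

# Pauli matrices as nested lists.
TAU = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def sigma_plus_i_tau_pi(sigma, pi):
    """Return the 2x2 matrix sigma + i tau.pi for real sigma and 3-vector pi."""
    return [[(sigma if r == c else 0) + 1j * sum(pi[k] * TAU[k][r][c] for k in range(3))
             for c in range(2)] for r in range(2)]


def apply(matrix, column):
    """Multiply a 2x2 matrix into a 2-component column."""
    return [sum(matrix[r][c] * column[c] for c in range(2)) for r in range(2)]


def close(a, b):
    return all(abs(x - y) < 1e-12 for x, y in zip(a, b))


random.seed(1)
sigma = random.uniform(-1, 1)
pi = [random.uniform(-1, 1) for _ in range(3)]
root2 = math.sqrt(2.0)

# Component forms (22.185) and (22.186).
phi = [(pi[1] + 1j * pi[0]) / root2, (sigma - 1j * pi[2]) / root2]
phi_c = [(sigma + 1j * pi[2]) / root2, (1j * pi[0] - pi[1]) / root2]

# Matrix forms (22.187).
m = sigma_plus_i_tau_pi(sigma, pi)
phi_matrix = [x / root2 for x in apply(m, [0, 1])]
phi_c_matrix = [x / root2 for x in apply(m, [1, 0])]

# Charge conjugate built as i tau_2 phi^* (standard definition, assumed here).
i_tau2 = [[0, 1], [-1, 0]]
phi_c_from_conj = apply(i_tau2, [z.conjugate() for z in phi])

print(close(phi, phi_matrix), close(phi_c, phi_c_matrix), close(phi_c, phi_c_from_conj))
```

All three comparisons agree, which confirms that both doublets are obtained from the single matrix $\hat\sigma + i\boldsymbol\tau\cdot\hat{\boldsymbol\pi}$ acting on the two basis columns.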


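Consider now the covariant SU(2)$_L$ × U(1) derivative acting on $\hat\phi$,
as in (22.30), and suppose to begin with that $g' = 0$. Then

$$\begin{aligned}
\hat D_\mu\hat\phi &= (\partial_\mu + ig\,\boldsymbol{\tau}\cdot\hat{\boldsymbol{W}}_\mu/2)\,\frac{1}{\sqrt{2}}(\hat\sigma + i\boldsymbol{\tau}\cdot\hat{\boldsymbol{\pi}})\begin{pmatrix}0\\ 1\end{pmatrix}\\
&= \frac{1}{\sqrt{2}}\Big\{\partial_\mu\hat\sigma + i\boldsymbol{\tau}\cdot\partial_\mu\hat{\boldsymbol{\pi}} + i\frac{g}{2}\hat\sigma\,\boldsymbol{\tau}\cdot\hat{\boldsymbol{W}}_\mu - \frac{g}{2}\big[\hat{\boldsymbol{\pi}}\cdot\hat{\boldsymbol{W}}_\mu + i\boldsymbol{\tau}\cdot\hat{\boldsymbol{W}}_\mu\times\hat{\boldsymbol{\pi}}\big]\Big\}\begin{pmatrix}0\\ 1\end{pmatrix}
\end{aligned} \tag{22.188}$$

using $\tau_i\tau_j = \delta_{ij} + i\epsilon_{ijk}\tau_k$. Now the vacuum
choice (22.28) corresponds to $\hat\sigma = v$, $\hat{\boldsymbol{\pi}} = 0$,
so that when we form $(\hat D_\mu\hat\phi)^\dagger(\hat D^\mu\hat\phi)$ from
(22.188) we will get just

$$\frac{1}{2}\,\frac{g^2v^2}{4}\,(0, 1)\,\boldsymbol{\tau}\cdot\hat{\boldsymbol{W}}_\mu\;\boldsymbol{\tau}\cdot\hat{\boldsymbol{W}}^\mu\begin{pmatrix}0\\ 1\end{pmatrix} = \frac{1}{2}M_W^2\,\hat{\boldsymbol{W}}_\mu\cdot\hat{\boldsymbol{W}}^\mu \tag{22.189}$$

with $M_W = gv/2$ as usual. The condition $g' = 0$ corresponds (cf (22.39))
to $\theta_W = 0$, and thus to $\hat W_{3\mu} = \hat Z_\mu$, and so (22.189)
says that in the limit $g' \to 0$, $M_W = M_Z$, as expected if
$\cos\theta_W = 1$. It is clear from (22.188) that the three components
$\hat{\boldsymbol{W}}_\mu$ are treated on a precisely equal footing by the
Higgs field (22.185), and indeed the notation suggests that
$\hat{\boldsymbol{W}}_\mu$ and $\hat{\boldsymbol{\pi}}$ should perhaps be
regarded as some kind of new triplets.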

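It is straightforward to calculate
$(\hat D_\mu\hat\phi)^\dagger(\hat D^\mu\hat\phi)$ from (22.188); one finds
(problem 22.9)

$$\begin{aligned}
(\hat D_\mu\hat\phi)^\dagger\hat D^\mu\hat\phi ={}& \frac{1}{2}(\partial_\mu\hat\sigma)^2 + \frac{1}{2}(\partial_\mu\hat{\boldsymbol{\pi}})^2 - \frac{g}{2}\,\partial_\mu\hat\sigma\,\hat{\boldsymbol{\pi}}\cdot\hat{\boldsymbol{W}}^\mu\\
&+ \frac{g}{2}\,\hat\sigma\,\partial_\mu\hat{\boldsymbol{\pi}}\cdot\hat{\boldsymbol{W}}^\mu + \frac{g}{2}\,\partial_\mu\hat{\boldsymbol{\pi}}\cdot(\hat{\boldsymbol{\pi}}\times\hat{\boldsymbol{W}}^\mu)\\
&+ \frac{g^2}{8}\,\hat{\boldsymbol{W}}_\mu^2\,(\hat\sigma^2 + \hat{\boldsymbol{\pi}}^2).
\end{aligned} \tag{22.190}$$

This expression now reveals what the symmetry is: (22.190) is invariant under
global SU(2) transformations under which $\hat{\boldsymbol{W}}_\mu$ and
$\hat{\boldsymbol{\pi}}$ are vectors, that is

$$\hat{\boldsymbol{W}}_\mu \to \hat{\boldsymbol{W}}_\mu + \boldsymbol{\epsilon}\times\hat{\boldsymbol{W}}_\mu, \qquad \hat{\boldsymbol{\pi}} \to \hat{\boldsymbol{\pi}} + \boldsymbol{\epsilon}\times\hat{\boldsymbol{\pi}}, \qquad \hat\sigma \to \hat\sigma. \tag{22.191}$$

This is why, from the term $\hat{\boldsymbol{W}}_\mu^2\hat\sigma^2$, all
three W fields have the same mass in this $g' \to 0$ limit.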
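A quick numerical cross-check of the expansion of the Higgs kinetic term can be made by treating the fields and their derivatives as real numbers at one point and keeping a single Lorentz component, so that no metric signs enter. The sketch below, which is only a consistency test under those simplifying assumptions, compares the squared norm of $(\partial + ig\boldsymbol\tau\cdot\boldsymbol W/2)\frac{1}{\sqrt2}(\sigma + i\boldsymbol\tau\cdot\boldsymbol\pi)(0,1)^T$ with the six-term expression having coefficients $\frac12, \frac12, -\frac g2, +\frac g2, +\frac g2, \frac{g^2}{8}$.

```python
import math
import random

TAU = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def tau_dot(vec):
    """2x2 matrix tau.vec for a real 3-vector."""
    return [[sum(vec[k] * TAU[k][r][c] for k in range(3)) for c in range(2)]
            for r in range(2)]


def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def scalar_plus_i_tau(s, vec):
    t = tau_dot(vec)
    return [[(s if r == c else 0) + 1j * t[r][c] for c in range(2)] for r in range(2)]


def kinetic_direct(g, sigma, pi, d_sigma, d_pi, w):
    """|D phi|^2 from the matrix form, for one Lorentz component."""
    field = scalar_plus_i_tau(sigma, pi)
    d_field = scalar_plus_i_tau(d_sigma, d_pi)
    gauge = matmul(tau_dot(w), field)
    total = [[d_field[r][c] + 0.5j * g * gauge[r][c] for c in range(2)]
             for r in range(2)]
    # Acting on the column (0, 1) picks out the second column of the matrix.
    column = [total[0][1] / math.sqrt(2.0), total[1][1] / math.sqrt(2.0)]
    return sum(abs(z) ** 2 for z in column)


def kinetic_formula(g, sigma, pi, d_sigma, d_pi, w):
    """Six-term expansion, for one Lorentz component."""
    return (0.5 * d_sigma**2 + 0.5 * dot(d_pi, d_pi)
            - 0.5 * g * d_sigma * dot(pi, w)
            + 0.5 * g * sigma * dot(d_pi, w)
            + 0.5 * g * dot(d_pi, cross(pi, w))
            + g**2 / 8.0 * dot(w, w) * (sigma**2 + dot(pi, pi)))


def rand3():
    return [random.uniform(-1, 1) for _ in range(3)]


random.seed(7)
args = (0.65, random.uniform(-1, 1), rand3(), random.uniform(-1, 1), rand3(), rand3())
print(kinetic_direct(*args), kinetic_formula(*args))
```

The two numbers agree to rounding error for any choice of inputs. Since every term in the expansion is built from dot and cross products of three-vectors, its invariance under a common rotation of $\boldsymbol W$, $\boldsymbol\pi$ and $\partial\boldsymbol\pi$ is manifest.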

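If we now reinstate $g'$, and use (22.36) and (22.37) to write
$\hat W_{3\mu}$ and $\hat B_\mu$ in terms of the physical fields $\hat Z_\mu$
and $\hat A_\mu$ as in (19.96), (22.188) becomes

$$\begin{aligned}
\hat D_\mu\hat\phi = \Big\{&\partial_\mu + i\frac{g}{2}\tau_1\hat W_{1\mu} + i\frac{g}{2}\tau_2\hat W_{2\mu} + i\frac{g}{2}\tau_3\frac{\hat Z_\mu}{\cos\theta_W} + ig\sin\theta_W\Big(\frac{1+\tau_3}{2}\Big)\hat A_\mu\\
&- i\frac{g}{\cos\theta_W}\sin^2\theta_W\Big(\frac{1+\tau_3}{2}\Big)\hat Z_\mu\Big\}\,\frac{1}{\sqrt{2}}(\hat\sigma + i\boldsymbol{\tau}\cdot\hat{\boldsymbol{\pi}})\begin{pmatrix}0\\ 1\end{pmatrix}.
\end{aligned} \tag{22.192}$$

We see from (22.192) that $g' \neq 0$ has two effects. First, there is a
‘$\boldsymbol{\tau}\cdot\hat{\boldsymbol{W}}$’-like term, as in (22.188),
except that the ‘$\hat W_3$’ part of it is now $\hat Z/\cos\theta_W$. In the
vacuum $\hat\sigma = v$, $\hat{\boldsymbol{\pi}} = 0$, which simply means
that the mass of the Z is $M_Z = M_W/\cos\theta_W$, i.e. $\rho = 1$; and this
relation is preserved under ‘rotations’ of the form (22.191), since they do
not mix $\hat{\boldsymbol{\pi}}$ and $\hat\sigma$. Hence this mass relation
(and $\rho = 1$) is a consequence of the global SU(2) symmetry of the
interactions and the vacuum under (22.191), and of the relations (22.36) and
(22.37) which embody the requirement of a massless photon.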
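The tree-level statement $\rho = 1$ can be illustrated by diagonalizing the neutral gauge boson mass matrix that a doublet vacuum value produces. The following sketch assumes the standard form of that matrix in the $(\hat W_3, \hat B)$ basis, $\frac{v^2}{4}\begin{pmatrix} g^2 & -gg'\\ -gg' & g'^2\end{pmatrix}$ (not written out in this section), and the coupling values in the example are arbitrary.

```python
import math


def gauge_boson_masses(g, g_prime, v):
    """Tree-level masses from a doublet vacuum value v (convention M_W = g v / 2)."""
    m_w = 0.5 * g * v
    # Neutral mass-squared matrix in the (W3, B) basis (assumed standard form).
    a = 0.25 * v**2 * g**2
    b = -0.25 * v**2 * g * g_prime
    d = 0.25 * v**2 * g_prime**2
    trace, det = a + d, a * d - b * b
    disc = math.sqrt(max(trace**2 - 4.0 * det, 0.0))
    m_photon2 = 0.5 * (trace - disc)      # vanishing eigenvalue: the photon
    m_z2 = 0.5 * (trace + disc)           # non-zero eigenvalue: the Z
    cos2 = g**2 / (g**2 + g_prime**2)     # cos^2(theta_W)
    rho = m_w**2 / (m_z2 * cos2)
    return m_w, math.sqrt(m_z2), m_photon2, rho


m_w, m_z, m_photon2, rho = gauge_boson_masses(0.65, 0.35, 246.0)
print(f"M_W = {m_w:.2f} GeV, M_Z = {m_z:.2f} GeV, rho = {rho:.12f}")

# In the limit g' -> 0 the Z becomes degenerate with the W.
print(gauge_boson_masses(0.65, 0.0, 246.0)[:2])
```

For any $g$, $g'$ and $v$ the ratio $M_W^2/(M_Z^2\cos^2\theta_W)$ comes out as 1 up to rounding, and setting $g' = 0$ makes the two masses equal.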

On the other hand, there are additional terms in (22.192) which single
out the ‘$\tau_3$’ component, and therefore break this global SU(2). These
terms vanish as $g' \to 0$, and do not contribute at tree level, but we
expect that they will cause $O(g'^2)$ corrections to $\rho = 1$ at the one
loop level.

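None of the above, however, yet involves the quark masses, and the question
of why $m_t^2 - m_b^2$ appears in the numerator in (22.180). We can now
answer this question. Consider a typical mass term, of the form discussed in
section 22.5.2, for a quark doublet of the $i$th family

$$\hat{\mathcal{L}}_m = -g_+\,(\bar{\hat u}_{Li}\ \bar{\hat d}_{Li})\,\hat\phi_C\,\hat u_{Ri} - g_-\,(\bar{\hat u}_{Li}\ \bar{\hat d}_{Li})\,\hat\phi\,\hat d_{Ri}. \tag{22.193}$$

Using (22.185) and (22.186), this can be written as

$$\begin{aligned}
\hat{\mathcal{L}}_m ={}& -\frac{g_+}{\sqrt{2}}\,(\bar{\hat u}_{Li}\ \bar{\hat d}_{Li})(\hat\sigma + i\boldsymbol{\tau}\cdot\hat{\boldsymbol{\pi}})\begin{pmatrix}\hat u_{Ri}\\ 0\end{pmatrix} - \frac{g_-}{\sqrt{2}}\,(\bar{\hat u}_{Li}\ \bar{\hat d}_{Li})(\hat\sigma + i\boldsymbol{\tau}\cdot\hat{\boldsymbol{\pi}})\begin{pmatrix}0\\ \hat d_{Ri}\end{pmatrix}\\
={}& -\frac{(g_+ + g_-)}{2\sqrt{2}}\,(\bar{\hat u}_{Li}\ \bar{\hat d}_{Li})(\hat\sigma + i\boldsymbol{\tau}\cdot\hat{\boldsymbol{\pi}})\begin{pmatrix}\hat u_{Ri}\\ \hat d_{Ri}\end{pmatrix}\\
& -\frac{(g_+ - g_-)}{2\sqrt{2}}\,(\bar{\hat u}_{Li}\ \bar{\hat d}_{Li})(\hat\sigma + i\boldsymbol{\tau}\cdot\hat{\boldsymbol{\pi}})\,\tau_3\begin{pmatrix}\hat u_{Ri}\\ \hat d_{Ri}\end{pmatrix}.
\end{aligned} \tag{22.194}$$
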
Consider now a simultaneous (infinitesimal) global SU(2) transformation
on the two doublets $(\hat u_{Li}, \hat d_{Li})^T$ and
$(\hat u_{Ri}, \hat d_{Ri})^T$:

$$\begin{pmatrix}\hat u_{Li}\\ \hat d_{Li}\end{pmatrix} \to (1 - i\boldsymbol{\epsilon}\cdot\boldsymbol{\tau}/2)\begin{pmatrix}\hat u_{Li}\\ \hat d_{Li}\end{pmatrix}, \qquad \begin{pmatrix}\hat u_{Ri}\\ \hat d_{Ri}\end{pmatrix} \to (1 - i\boldsymbol{\epsilon}\cdot\boldsymbol{\tau}/2)\begin{pmatrix}\hat u_{Ri}\\ \hat d_{Ri}\end{pmatrix}. \tag{22.195}$$

Under (22.195), the first term of (22.194) becomes (to first order in
$\boldsymbol{\epsilon}$)

$$-\frac{(g_+ + g_-)}{2\sqrt{2}}\,(\bar{\hat u}_{Li}\ \bar{\hat d}_{Li})\big[\hat\sigma + i\boldsymbol{\tau}\cdot(\hat{\boldsymbol{\pi}} + \hat{\boldsymbol{\pi}}\times\boldsymbol{\epsilon})\big]\begin{pmatrix}\hat u_{Ri}\\ \hat d_{Ri}\end{pmatrix}. \tag{22.196}$$

From (22.196) we see that if, at the same time as (22.195), we also make
the transformation of $\hat{\boldsymbol{\pi}}$ given in (22.191), then this
first term in $\hat{\mathcal{L}}_m$ will be invariant under these combined
transformations. The second term in (22.194), however, will not be invariant
under (22.195), but only under transformations with
$\epsilon_1 = \epsilon_2 = 0$, $\epsilon_3 \neq 0$. We conclude that the
global SU(2) symmetry of (22.191), which was responsible for $\rho = 1$ at
the tree level, can be extended also to the quark sector; but, because the
$g_\pm$ in (22.193) are proportional to the masses of the quark doublet, this
symmetry is explicitly broken by the quark mass difference. This is why a t-b
loop in a W vacuum polarization correction can produce the ‘non-decoupled’
contribution (22.180) to $\rho$, which grows as $m_t^2 - m_b^2$ and produces
quite detectable shifts from the tree-level predictions, given the accuracy
of the data.

Returning to (22.195), the transformation on the L-components is just
the same as a standard SU(2)$_L$ transformation, except that it is global; so
the gauge interactions of the quarks obey this symmetry also. As far as
the R-components are concerned, they are totally decoupled in the gauge
dynamics, and we are free to make the transformation (22.195) if we wish.
The resulting complete transformation, which does the same to both the L
and R components, is a non-chiral one; in fact it is precisely an ordinary
‘isospin’ transformation of the type

$$\begin{pmatrix}\hat u\\ \hat d\end{pmatrix} \to (1 - i\boldsymbol{\epsilon}\cdot\boldsymbol{\tau}/2)\begin{pmatrix}\hat u\\ \hat d\end{pmatrix}. \tag{22.197}$$

The reader will recognize that the mathematics here is exactly the same as
that in section 18.3, involving the SU(2) of isospin in the $\sigma$-model.
This analysis of the symmetry of the Higgs (or a more general symmetry
breaking sector) was first given by Sikivie et al. (1980). The isospin-SU(2)
is frequently called ‘custodial SU(2)’ since it ‘protects’ $\rho = 1$.

FIGURE 22.26
One-boson self-energy graph in $(\hat\phi^\dagger\hat\phi)^2$.
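The custodial-symmetry argument can be checked numerically with ordinary 2×2 matrices. The sketch below assumes that the quark rotation $q \to (1 - i\boldsymbol\epsilon\cdot\boldsymbol\tau/2)q$ on both L and R doublets sandwiches the matrix between them as $(1 + i\boldsymbol\epsilon\cdot\boldsymbol\tau/2)\,M\,(1 - i\boldsymbol\epsilon\cdot\boldsymbol\tau/2)$, and that $\boldsymbol\pi \to \boldsymbol\pi + \boldsymbol\epsilon\times\boldsymbol\pi$ at the same time. It then measures how much the two structures $\sigma + i\boldsymbol\tau\cdot\boldsymbol\pi$ and $(\sigma + i\boldsymbol\tau\cdot\boldsymbol\pi)\tau_3$ change; the field values are arbitrary illustrative numbers.

```python
TAU = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]


def tau_dot(vec):
    return [[sum(vec[k] * TAU[k][r][c] for k in range(3)) for c in range(2)]
            for r in range(2)]


def matmul(a, b):
    return [[sum(a[r][k] * b[k][c] for k in range(2)) for c in range(2)]
            for r in range(2)]


def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]


def higgs_matrix(sigma, pi):
    """sigma + i tau.pi as a 2x2 matrix."""
    t = tau_dot(pi)
    return [[(sigma if r == c else 0) + 1j * t[r][c] for c in range(2)]
            for r in range(2)]


def rotate_quarks(matrix, eps):
    """Sandwich (1 + i eps.tau/2) M (1 - i eps.tau/2) induced by the quark rotation."""
    e = tau_dot(eps)
    left = [[(1 if r == c else 0) + 0.5j * e[r][c] for c in range(2)] for r in range(2)]
    right = [[(1 if r == c else 0) - 0.5j * e[r][c] for c in range(2)] for r in range(2)]
    return matmul(matmul(left, matrix), right)


def max_diff(a, b):
    return max(abs(a[r][c] - b[r][c]) for r in range(2) for c in range(2))


sigma, pi = 0.8, [0.3, -0.5, 0.2]     # arbitrary illustrative field values
tau3 = TAU[2]


def residuals(eps):
    """Change of the symmetric and tau_3 structures under the combined transformation."""
    shifted_pi = [p + d for p, d in zip(pi, cross(eps, pi))]   # pi -> pi + eps x pi
    original = higgs_matrix(sigma, pi)
    rotated = higgs_matrix(sigma, shifted_pi)
    first = max_diff(rotate_quarks(rotated, eps), original)
    second = max_diff(rotate_quarks(matmul(rotated, tau3), eps),
                      matmul(original, tau3))
    return first, second


print(residuals([1e-4, 2e-4, -1.5e-4]))   # generic direction
print(residuals([0.0, 0.0, 1e-4]))        # rotation about the 3-axis only
```

For a generic $\boldsymbol\epsilon$ the first structure changes only at second order in $\boldsymbol\epsilon$, while the $\tau_3$ structure changes at first order; for a rotation about the 3-axis both residuals are second order. This is the numerical counterpart of the statement that the term proportional to $g_+ - g_-$ breaks the custodial SU(2) down to rotations about the 3-axis.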

What about the absence of $m_H^2$ corrections? Here the position is rather
more subtle. Without the Higgs particle H the theory is non-renormalizable,
and hence one might expect to see some radiative correction becoming very
large ($O(m_H^2)$) as one tried to ‘banish’ H from the theory by sending
$m_H \to \infty$ ($m_H$ would be acting like a cut-off). The reason is that
in such a $(\hat\phi^\dagger\hat\phi)^2$ theory, the simplest loop we meet is
that shown in figure 22.26, and it is easy to see by counting powers, as
usual, that it diverges as the square of the cut-off. This loop contributes
to the Higgs self-energy, and will be renormalized by taking the value of the
coefficient of $\hat\phi^\dagger\hat\phi$ in (22.30) from experiment. We will
return to this particular detail in section 22.8.1.

Even without a Higgs contribution however, it turns out that the electroweak
theory is renormalizable at the one-loop level if the fermion masses are
zero (Veltman 1968, 1970). Thus one suspects that the large $m_H^2$ effects
will not be so dramatic after all. In fact, calculation shows (Veltman 1977;
Chanowitz et al. 1978, 1979) that one-loop radiative corrections to
electroweak observables grow at most like $\ln m_H^2$ for large $m_H$. While
there are finite corrections which are approximately $O(m_H^2)$ for
$m_H^2 \ll M_{W,Z}^2$, for $m_H^2 \gg M_{W,Z}^2$ the $O(m_H^2)$ pieces cancel
out from all observable quantities⁴, leaving only $\ln m_H^2$ terms. This is
just what we have in (22.181), and it means, unfortunately, that the
sensitivity of the data to this important parameter of the Standard Model is
only logarithmic. Fits to data typically give $m_H$ in the region of 90 GeV
at the minimum of the $\chi^2$ curve, but the error (which is not simple to
interpret) is of the order of 25 GeV.

At the two-loop level, the expected $O(m_H^4)$ behaviour becomes $O(m_H^2)$
instead (van der Bij and Veltman 1984, van der Bij 1984), and of course
appears (relative to the one-loop contributions) with an additional factor of
$O(\alpha)$. This relative insensitivity of the radiative corrections to
$m_H$, in the limit of large $m_H$, was discovered by Veltman (1977) and
called a ‘screening’ phenomenon by him: for large $m_H$ (which also means, as
we have seen, large $\lambda$) we have an effectively strongly interacting
theory whose principal effects are screened off from observables at lower
energy. It was shown by Einhorn and Wudka (1989) that this screening is also
a consequence of the (approximate) isospin-SU(2) symmetry we have just
discussed in connection with (22.180). Phenomenologically, the upshot is that
it was unfortunately very difficult to get an accurate handle on the value of
$m_H$ from fits to the precision data. With the top quark, the situation was
very different.

⁴ Apart from the $\hat\phi^\dagger\hat\phi$ coefficient! See section 22.8.1.




22.7 The top quark 


Having drawn attention to the relative sensitivity of radiative corrections to
loops containing virtual top quarks, it is worth devoting a little space to a
‘backward glance’ at the year immediately prior to the discovery of the
t-quark (Abe et al. 1994a, b, 1995b, Abachi et al. 1995b) at the CDF and D0
detectors at FNAL’s Tevatron, in $p\bar p$ collisions at
$E_{\text{cm}} = 1.8$ TeV.

The W and Z particles were, as we have seen, discovered in 1983 and at
that time, and for some years subsequently, the data were not precise enough
to be sensitive to virtual t-effects. In the late 1980’s and early 1990’s,
LEP at CERN and SLC at Stanford began to produce new and highly accurate data
which did allow increasingly precise predictions to be made for the top quark
mass, $m_t$. Thus a kind of race began, between experimentalists searching
for the real top, and theorists fitting ever more precise data to get tighter
and tighter limits on $m_t$ from its virtual effects.

In fact, by the time of the actual experimental discovery of the top quark,
the experimental error in $m_t$ was just about the same as the theoretical
one (and, of course, the central values were consistent). Thus, in their May
1994 review of the electroweak theory (contained in Montanet et al. 1994, p
1304ff) Erler and Langacker gave the result of a fit to all electroweak data
as

$$m_t = 169\,^{+16}_{-18}\,^{+17}_{-20}\ \text{GeV}, \tag{22.198}$$

the central figure and first error being based on $m_H = 300$ GeV, the second
(+) error assuming $m_H = 1000$ GeV and the second (−) error assuming
$m_H = 60$ GeV.⁵ At about the same time, Ellis et al. (1994) gave the
extraordinarily precise value

$$m_t = 162 \pm 9\ \text{GeV} \tag{22.199}$$

without any assumption for $m_H$.

A month or so earlier, the CDF collaboration (Abe et al. 1994a,b) announced
12 events consistent with the hypothesis of production of a $t\bar t$ pair,
and on this hypothesis the mass was found to be

$$m_t = 174 \pm 10\,^{+13}_{-12}\ \text{GeV}, \tag{22.200}$$

and this was followed by nine similar events from D0 (Abachi et al. 1995a).
By February 1995 both groups had amassed more data and the discovery was
announced (Abe et al. 1995b, Abachi et al. 1995b). The 2010 experimental
value for $m_t$ is $173.1 \pm 1.3$ GeV (Nakamura et al. 2010) as compared to
the value predicted by fits to the electroweak data of $173.2 \pm 1.3$ GeV.
This represents an extraordinary triumph for both theory and experiment. It
is surely remarkable how the quantum fluctuations of a yet-to-be-detected new
particle could pin down its mass so precisely. It seems hard to deny that
Nature has indeed made use of the subtle intricacies of a spontaneously
broken non-Abelian gauge theory.

⁵ The relatively small effect of large variations in $m_H$ illustrates the
lack of sensitivity to virtual Higgs effects, noted in the preceding section.

One feature of the ‘real’ top events is particularly noteworthy. Unlike the
mass of the other quarks, $m_t$ is greater than $M_W$, and this means that it
can decay to b + W via real W emission:

$$t \to W^+ + b. \tag{22.201}$$

In contrast, the b quark itself decays by the usual virtual W processes. Now
we have seen that the virtual process is suppressed by $\sim 1/M_W^2$ if the
energy release (as in the case of b-decay) is well below $M_W$. But the real
process (22.201) suffers no such suppression and proceeds very much faster.
In fact (problem 22.10) the top quark lifetime from (22.201) is estimated to
be $\sim 4 \times 10^{-25}$ s! This is quite similar to the lifetime of the
$W^+$ itself, via $W^+ \to e^+\nu_e$ for example. Consider now the production
of a $t\bar t$ pair in the collision between two partons. As the $t$ and
$\bar t$ separate, the strong interactions which should eventually
‘hadronize’ them will not play a role until they are $\sim 1$ fm apart. But
if they are travelling close to the speed of light, they can only travel some
$10^{-16}$ m before decaying. Thus t’s tend to decay before they experience
the confining QCD interactions, a point we also made in section 1.2. Instead,
the hadronization is associated with the b quark, which has a more typical
weak lifetime ($\sim 1.5 \times 10^{-12}$ s). By the same token, this fast
decay of the t quark means that there will be no detectable $t\bar t$
‘toponium’, bound by QCD.
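The order of magnitude of the top lifetime can be reproduced with a few lines of Python. The sketch below uses the standard lowest-order width for $t \to W^+ b$ with the b-quark mass neglected and $|V_{tb}| = 1$, namely $\Gamma = \frac{G_F m_t^3}{8\pi\sqrt2}(1 - x)^2(1 + 2x)$ with $x = M_W^2/m_t^2$. That formula is not quoted in the text and is brought in here as an assumption, as are the numerical values of $G_F$ and $\hbar$.

```python
import math

G_F = 1.1663787e-5        # GeV^-2, assumed standard value
HBAR = 6.582119569e-25    # GeV s, assumed standard value
C = 2.99792458e8          # m / s
M_T = 173.1               # GeV, experimental value quoted in the text
M_W = 80.399              # GeV, (22.165)


def top_width(m_t=M_T, m_w=M_W, g_fermi=G_F, v_tb=1.0):
    """Lowest-order width for t -> W b, neglecting the b-quark mass (assumed formula)."""
    x = (m_w / m_t) ** 2
    return (g_fermi * m_t**3 / (8.0 * math.pi * math.sqrt(2.0))
            * v_tb**2 * (1.0 - x) ** 2 * (1.0 + 2.0 * x))


width = top_width()
lifetime = HBAR / width
print(f"Gamma(t -> W b) = {width:.2f} GeV")
print(f"lifetime = {lifetime:.1e} s, c * lifetime = {C * lifetime:.1e} m")
```

The width comes out at about 1.5 GeV, giving a lifetime of a few times $10^{-25}$ s and a flight distance of order $10^{-16}$ m, far smaller than the $\sim 1$ fm hadronization scale.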

With the t quark safely real, the Higgs boson was the one remaining miss- 
ing particle in the Standard Model complement, and its discovery was of the 
utmost importance. We end this book with a brief review of Higgs physics 
and the experiments leading to the probable discovery of this long-awaited 
particle in 2012. 




22.8 The Higgs sector 


It is worth noting that an essential feature of the type of theory which has 
been described in this note is the prediction of incomplete multiplets of 
scalar and vector bosons. 


- P W Higgs (1964)

22.8.1 Introduction


The Lagrangian for an unbroken SU(2)$_L$ × U(1) gauge theory of vector bosons
and fermions is rather simple and elegant, all the interactions being
determined by just two Lagrangian parameters $g$ and $g'$ in a ‘universal’
way. All the particles in this hypothetical world are, however, massless. In
the real world, while the electroweak interactions are undoubtedly well
described by the SU(2)$_L$ × U(1) theory, neither the mediating gauge quanta
(apart from the photon) nor the fermions are massless. They must acquire mass
in some way that does not break the gauge symmetry of the Lagrangian, or else
the renormalizability of the theory is destroyed, and its remarkable
empirical success (at a level which includes loop corrections) would be some
kind of freak accident. In chapter 19 we discussed how such a breaking of a
gauge symmetry does happen, dynamically, in a superconductor. In that case
‘electron pairing’ was a crucial ingredient. In particle physics, while a lot
of effort has gone into examining various analogous ‘dynamical symmetry
breaking’ theories, none has yet emerged as both theoretically compelling and
phenomenologically viable. However, a simple count of the number of degrees
of freedom in a massive vector field, as opposed to a massless one, indicates
that some additional fields must be present in order to give mass to the
originally massless gauge bosons. And so, in the Standard Model, it is simply
assumed, following the original ideas of Higgs and others (Higgs 1964,
Englert and Brout 1964, Guralnik et al. 1964; Higgs 1966) that a suitable
scalar (‘Higgs’) field exists, with a potential which causes the ground state
(the vacuum) to break the symmetry spontaneously. Furthermore, rather than
(as in BCS theory) obtaining the fermion mass gaps dynamically, they too are
put in ‘by hand’ via Yukawa-like couplings to the Higgs field.


It has to be admitted that this part of the Standard Model appears to be 
the least satisfactory. Consider the Higgs couplings, which are listed in ap- 
pendix Q, section Q.2.3. While the couplings of the Higgs field to the gauge 
fields are all determined by the gauge symmetry, the Higgs self-couplings 
(trilinear and quadrilinear) are not gauge interactions and are unrelated to 
anything else in the theory. Likewise, the Yukawa-like fermion couplings are 
not gauge interactions either, and they are both unconstrained and uncom- 
fortably different in orders of magnitude. True, all these are renormalizable 
couplings — but this basically means that their values are not calculable and 
have all to be taken from experiment. 


Such considerations may indicate that the ‘Higgs Sector’ of the Standard
Model is on a somewhat different footing from the rest of it, a commonly held
view indeed. Perhaps it should be regarded as more a ‘phenomenology’ than
a ‘theory’, much as the current-current model was. In this connection, we may
mention a point which has long worried many theorists. In section 22.6 we
noted that figure 22.26 gives a quadratically divergent ($O(\Lambda^2)$) and
positive contribution to the $\hat\phi^\dagger\hat\phi$ term in the
Lagrangian, at one loop order. This term would ordinarily, of course, be just
the mass term of the scalar field. But in the Higgs case, the matter is much
more delicate. The whole phenomenology depends on the renormalized
coefficient having a negative value, triggering the spontaneous breaking of
the symmetry. This means that the $O(\Lambda^2)$ one-loop correction must be
cancelled by the ‘bare’ mass term
$\frac{1}{2}m_{H,0}^2\,\hat\phi^\dagger\hat\phi$ so as to achieve a negative
coefficient of order $-v^2$. This cancellation between $m_{H,0}^2$ and
$\Lambda^2$ will have to be very precise indeed if $\Lambda$, the scale of
‘new physics’, is very high, as is commonly assumed (say $10^{16}$ GeV).
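The degree of cancellation required can be estimated very crudely as the ratio $v^2/\Lambda^2$, ignoring loop factors and coupling constants (an assumption made purely for illustration; these factors change the numbers by a few orders of magnitude at most and do not alter the conclusion). A minimal Python sketch:

```python
V = 246.0   # GeV, approximate electroweak vacuum value


def tuning_ratio(cutoff, v=V):
    """Ratio v^2 / Lambda^2: rough fractional precision of the required cancellation."""
    return (v / cutoff) ** 2


for cutoff in (1.0e3, 1.0e4, 1.0e16):
    print(f"Lambda = {cutoff:.0e} GeV: v^2 / Lambda^2 = {tuning_ratio(cutoff):.1e}")
```

For a cut-off of a TeV or so the ratio is a few per cent, which is unremarkable, whereas for $\Lambda \sim 10^{16}$ GeV the two terms would have to cancel to roughly one part in $10^{27}$.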

The reader may wonder why attention should now be drawn to this particular
piece of renormalization: aren’t all divergences handled this way? In a
sense they are, but the fact is that this is the first case we have had in
which we have to cancel a quadratic divergence. The other mass-corrections
have all been logarithmic, for which there is nothing like such a dramatic
‘fine-tuning’ problem. There is a good reason for this in the case of the
electron mass, which we remarked on in section 11.2. Chiral symmetry forces
self-energy corrections for fermions to be proportional to their mass, and
hence to contain only logarithms of the cut-off. Similarly, gauge invariance
for the vector bosons prohibits any $O(\Lambda^2)$ corrections in
perturbation theory. But there is no symmetry, within the Standard Model,
which ‘protects’ the coefficient of $\hat\phi^\dagger\hat\phi$ in this way.
It is hard to understand what can be stopping it from being of order
$\Lambda^2$, if we take the apparently reasonable point of view that the
Standard Model will ultimately fail at some scale $\Lambda$ where new physics
enters. Thus the difficulty is: why is the empirical parameter $v$ ‘shielded’
from the presumed high scale of new physics? This ‘problem’ is often referred
to as the ‘hierarchy problem’, or the ‘fine-tuning problem’. We stress again
that we are dealing here with an absolutely crucial symmetry-breaking term,
which one would really like to understand far better.

Of course, the problem would go away if the scale $\Lambda$ were as low as,
say, a few TeV. As we shall see in the next section this happens to be, not
accidentally, the same scale at which the Standard Model ceases to be a
perturbatively calculable theory. Various possibilities have been suggested for
the kind of physics that might enter at energies of a few TeV. For example, 
‘technicolour’ models (Peskin 1997) regard the Higgs field as a composite of 
some new heavy fermions, rather like the BCS-pairing idea referred to ear- 
lier. A second possibility is supersymmetry (Aitchison 2007), in which there 
is a ‘protective’ symmetry operating, since scalar fields can be put alongside 
fermions in supermultiplets, and benefit from the protection enjoyed by the 
fermions. A third possibility is that of ‘large’ extra dimensions (Antoniadis 
2002). 

These undoubtedly fascinating ideas obviously take us well beyond our 
proper subject to which we must now return. Whatever may lie ‘beyond’ 
it, the Lagrangian of the Higgs sector of the Standard Model leads to many 
perfectly definite predictions which may be confronted with experiment, as we 
shall briefly discuss in section 22.8.3 (for a full account see Dawson et al. 1990, 
and for more compact ones see Ellis, Stirling and Webber 1996, chapter 11, 
and the review by Bernardi, Carena and Junk in Beringer et al. 2012). The
elucidation of the mechanism of gauge symmetry breaking is undoubtedly of
the greatest importance to particle physics: quite apart from the SU(2)$_L$ × U(1)
theory, very many of the proposed theories which go ‘beyond the Standard 
Model’ face a similar ‘mass problem’, and generally appeal to some variant of 
the ‘Higgs mechanism’ to deal with it. 

As Higgs noted in the final paragraph of his 2-page Letter (Higgs 1964), an 
essential feature of the spontaneous symmetry breaking mechanism, in a gauge 
theory, is the appearance of incomplete multiplets of both scalar and vector 
bosons. Let us just rehearse this once more, in the SU(2) x U(1) case. We 
started with 4 massless gauge fields, belonging to an SU(2) triplet and a U(1) 
singlet; and, in addition, 4 scalar fields of equal mass, in an SU(2) doublet. 
After symmetry breaking, three massive vector bosons emerged, leaving the 
photon massless. In the scalar sector, three of the scalars became the longi- 
tudinal components of the three massive vector bosons, and one lone massive 
scalar field survived, all that remained of the original scalar doublet. Its mass
is a free parameter of the theory, being given by $m_H = \sqrt{2}\mu = \sqrt{\lambda}\,v/\sqrt{2}$. The
discovery, or otherwise, of this Higgs boson has therefore been a vital goal in
particle physics for over forty years. Before turning to experiment, however,
we want to mention some theoretical considerations concerning $m_H$ by way of
orientation.


22.8.2 Theoretical considerations concerning $m_H$


The coupling constant $\lambda$, which determines $m_H$ given the known value of $v$,
is unfortunately undetermined in the Standard Model. However, some quite
strong theoretical arguments suggest that $m_H$ cannot be arbitrarily large.

Like all coupling constants in a renormalizable theory, $\lambda$ must ‘run’. For
the $(\hat\phi^\dagger\hat\phi)^2$ interaction of (22.30), a one-loop calculation of the $\beta$-function leads
to

$$\lambda(E) = \frac{\lambda(v)}{1 - \dfrac{3\lambda(v)}{8\pi^2}\,\ln(E/v)}. \tag{22.202}$$

Like QED, this theory is not asymptotically free: the coupling increases with
the scale $E$. In fact, the theory becomes non-perturbative at the scale $E^*$ such
that

$$E^* \sim v\,\exp\left(\frac{8\pi^2}{3\lambda(v)}\right). \tag{22.203}$$

Note that this is exponentially sensitive to the ‘low-energy’ coupling constant
$\lambda(v)$, and that $E^*$ decreases rapidly as $\lambda(v)$ increases. But (see (22.40)) $m_H$ is
essentially proportional to $\lambda^{1/2}(v)$. Hence as $m_H$ increases, non-perturbative
behaviour sets in increasingly early. Suppose we say that we should like
perturbative behaviour to be maintained up to an energy scale $\Lambda$. Then we
require $E^* > \Lambda$, that is

$$m_H < v\left[\frac{4\pi^2}{3\ln(\Lambda/v)}\right]^{1/2}. \tag{22.204}$$

For $\Lambda \sim 10^{16}$ GeV, this gives $m_H < 160$ GeV. On the other hand, if the
non-perturbative regime sets in at 1 TeV, then the bound on $m_H$ is weaker,
$m_H < 750$ GeV.
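
The two numbers quoted for the bound (22.204) are easy to reproduce. The following is a minimal sketch, assuming the numerical value $v \approx 246$ GeV (not stated in this section) and using hypothetical function names; it also checks that a coupling saturating the bound puts the scale $E^*$ of (22.203) exactly at the chosen cut-off.

```python
import math

V_EW = 246.0  # GeV; assumed numerical value of the Higgs vacuum value v


def running_lambda(energy, lam_v, v=V_EW):
    """One-loop running coupling lambda(E) of (22.202)."""
    return lam_v / (1.0 - 3.0 * lam_v / (8.0 * math.pi ** 2) * math.log(energy / v))


def nonperturbative_scale(lam_v, v=V_EW):
    """Scale E* of (22.203), where the denominator of (22.202) vanishes."""
    return v * math.exp(8.0 * math.pi ** 2 / (3.0 * lam_v))


def higgs_mass_bound(cutoff, v=V_EW):
    """Upper bound (22.204) on m_H (GeV) for perturbativity up to `cutoff` (GeV)."""
    return v * math.sqrt(4.0 * math.pi ** 2 / (3.0 * math.log(cutoff / v)))


if __name__ == "__main__":
    for cutoff in (1.0e16, 1.0e3):
        print(f"Lambda = {cutoff:.0e} GeV: m_H < {higgs_mass_bound(cutoff):.0f} GeV")
```

With these inputs the bound comes out close to 160 GeV for a cut-off of $10^{16}$ GeV and close to 750 GeV for a cut-off of 1 TeV, which shows how weakly (logarithmically) the bound depends on the cut-off.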

This is an oversimplified argument for various reasons, though the essential
point is correct. An important omission is the contribution of the top quark to
the running of $\lambda(E)$. A more refined version (Hambye and Riesselmann 1997)
concludes that for $m_H < 180$ GeV the perturbative regime could extend all
the way to the Planck mass, $\sim 10^{19}$ GeV.

There is another, independent, argument which suggests that $m_H$ cannot
be too large. We have previously considered violations of unitarity by the 
lowest-order diagrams for certain processes (see chapter 21 and section 22.6). 
As we saw, in a non-gauge theory with massive vector bosons, such violations 
are associated with the longitudinal polarization states of the bosons, which 
carry factors proportional to the 4-momentum $k^\mu$ (see (22.18)). In a gauge
theory, strong cancellations in the high energy behaviour occur between differ- 
ent lowest-order diagrams. This behaviour is characteristic of gauge theories 
(Llewellyn Smith 1973, Cornwall et al. 1974), and is related to their renor- 
malizability. One process of this sort which we did not yet consider, however, 
is that in which two longitudinally polarized W's scatter from each other. A 
considerable number of diagrams (7 in all) contribute to this process, in
leading order: exchange of $\gamma$, Z and Higgs particles, together with the W-W self
interaction. When all these are added up the high-energy behaviour of the
total amplitude turns out to be proportional to $\lambda$, the Higgs coupling constant
(see for example Ellis, Stirling and Webber 1996, chapter 8). This at first sight
unexpected result can be understood as follows. The longitudinal components
of the W's arise from the ‘$\partial^\mu\hat\phi$’ parts in (22.30) (compare equation (19.48) in
the U(1) case), which produce $k^\mu$ factors. Thus the scattering of longitudinal
W's is effectively the scattering of the 3 Goldstone bosons in the complex
Higgs doublet. These bosons have self interactions arising from the $\lambda(\hat\phi^\dagger\hat\phi)^2$
Higgs potential, for which the Feynman amplitude is just proportional to $\lambda$.
Now, although such a constant term obviously cannot violate unitarity as the
energy increases (as happened in the other cases), it can do so if $\lambda$ itself is too
big, and since $\lambda \propto m_H^2$, this puts a bound on $m_H$. A constant amplitude is
pure $J = 0$ and so, in order of magnitude, we expect unitarity to imply $\lambda < 1$.
In terms of standard quantities,

$$\lambda = m_H^2 G_F/\sqrt{2}, \tag{22.205}$$

and so we expect

$$m_H < G_F^{-1/2} \sim 300\ \text{GeV}, \tag{22.206}$$


an energy scale we have seen several times before. A more refined analysis
(Lee et al. 1977a,b) gives

$$m_H < \left(\frac{8\sqrt{2}\,\pi}{3G_F}\right)^{1/2} \approx 1\ \text{TeV}. \tag{22.207}$$
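
As a quick numerical check of the two unitarity estimates (22.206) and (22.207), the sketch below evaluates both right-hand sides. The value of the Fermi constant is an assumed input and the function names are hypothetical.

```python
import math

G_F = 1.1663787e-5  # GeV^-2; assumed numerical value of the Fermi constant


def order_of_magnitude_bound(g_f=G_F):
    """Right-hand side of (22.206): G_F^(-1/2), in GeV."""
    return g_f ** -0.5


def refined_unitarity_bound(g_f=G_F):
    """Right-hand side of (22.207): (8 sqrt(2) pi / (3 G_F))^(1/2), in GeV."""
    return math.sqrt(8.0 * math.sqrt(2.0) * math.pi / (3.0 * g_f))


if __name__ == "__main__":
    print(f"G_F^(-1/2)      = {order_of_magnitude_bound():.0f} GeV")
    print(f"refined bound   = {refined_unitarity_bound():.0f} GeV")
```

The first number is just under 300 GeV and the second is very close to 1 TeV, in line with the values quoted in the text.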




Like the preceding argument, this one does not say that $m_H$ must be less
than some fixed number. Rather, it says that if $m_H$ gets bigger than a certain
value, perturbation theory will fail, or ‘new physics’ will enter. It is, in fact,
curiously reminiscent of the original situation with the four-fermion
current-current interaction itself (compare (22.10) with (22.206)). Perhaps this is a
clue that we may eventually need to replace the Higgs phenomenology. At all 
events, this line of reasoning seems to imply that the Higgs boson will either 
be found at a mass well below 1 TeV, or else some electroweak interactions 
will become effectively strong with new physical consequences. This ‘no lose’ 
situation provided powerful motivation for the construction of the LHC. 

There is also an interesting lower bound on the Higgs mass, which is
derived from the requirement of vacuum stability. If the Higgs mass is
sufficiently lighter than the top quark mass, the top quark loop contribution to
the running of the quartic coupling $\lambda(E)$ can cause the coupling to go
negative at large energy scales (Cabibbo et al. 1979). This would imply that, at
such scales, the effective scalar potential of the Standard Model would be
unbounded below at large absolute values of the field, and there would no longer
be a stable ground state (vacuum). This can be tolerated if the lifetime of the
metastable vacuum is greater than the age of the Universe (see Isidori et al. 2001,
and references cited therein). A re-examination of the issue by Elias-Miro et
al. (2012) showed that the Standard Model vacuum would become unstable
at scales around the Planck mass, for $m_H < 130$ GeV. For $m_H \sim 125$ GeV,
instability occurs at scales of order $10^{10}$ GeV, but the lifetime is greater than
the age of the Universe. Of course, new physics may enter well before such a
scale. It is nevertheless intriguing that a Higgs mass in this region may have 
implications for the physics of the early Universe. 

We now consider some simple aspects of Higgs boson production and de- 
cay processes at collider energies, as predicted by the Standard Model, and 
conclude with the experiments leading to the probable Higgs boson discovery 
in 2012. 


22.8.3 Higgs boson searches and the 2012 discovery


We begin by considering the main production and decay modes. The existing 
lower bound on $m_H$ established at LEP (LEP 2003)

$$m_H > 114.4\ \text{GeV} \quad (95\%\ \text{Confidence Level}) \tag{22.208}$$


already excluded many possibilities in both production and decay. Subsequent 
searches were carried out at the hadron colliders. At both the Tevatron and 
the LHC, the dominant parton-level production mechanism is ‘gluon fusion’ 
via an intermediate top quark loop as shown in figure 22.27 (Georgi et al. 
1978, Glashow et al. 1978, Stange et al. 1994a,b). The intermediate t quark 
dominates, because the Higgs couplings to fermions are proportional to the 
fermion mass. Since the gluon probability distribution rises rapidly at small
$x$ values, which are probed at larger collider energy $\sqrt{s}$, the cross section for
this process (which is the same for $pp$ and $p\bar{p}$ colliders) will rise with energy.
At the Tevatron with $\sqrt{s} = 1.96$ TeV, the cross section ranges from about
1 pb for $m_H \sim 100$ GeV to 0.2 pb for $m_H \sim 200$ GeV. At an LHC energy
of $\sqrt{s} = 7$ TeV, the cross section is about 25 pb for $m_H \sim 100$ GeV and 0.1
pb for $m_H \sim 700$ GeV, rising to about 70 pb and 1 pb respectively at $\sqrt{s} =
14$ TeV (Dittmaier et al. 2011). These numbers include QCD corrections,
which increase the parton-level cross sections by a factor of about 2.

FIGURE 22.27
Higgs boson production process by ‘gluon fusion’.

FIGURE 22.28
Higgs boson production process by ‘vector boson fusion’.

The next largest cross sections, some ten times smaller than the gluon
fusion process, are for Higgs production via ‘vector boson fusion’ ($qq' \to qq'H$,
see figure 22.28) and for associated production of a Higgs boson with a vector
boson ($q\bar{q} \to WH, ZH$, see figure 22.29).

These processes involve the trilinear Higgs couplings to the vector bosons,
which are proportional to their masses (see appendix Q). At the LHC, the first
of these cross sections is somewhat larger than the second for $m_H < 130$ GeV,
while the order is reversed at the Tevatron because the initial state is $p\bar{p}$.
A fourth production possibility, at a significantly smaller rate, is ‘associated
production with top quarks’ as shown in figure 22.30, for example. Figure
22.31 (taken from Ellis, Stirling and Webber 1996) shows the cross sections
for the various production processes as a function of $m_H$, for $pp$ collisions
at $\sqrt{s} = 14$ TeV. Updated calculations (including QCD and electroweak
corrections) are described in reports by Dittmaier et al. (2011, 2012), which
present the results of a very large-scale theoretical effort.

FIGURE 22.29
Higgs boson production in association with W or Z.

FIGURE 22.30
Higgs boson production in association with a $t\bar{t}$ pair.

The Higgs boson must be detected via its decays. For $m_H < 135$ GeV,
decays to fermion-antifermion pairs dominate, of which $b\bar{b}$ has the largest
branching ratio because of the larger value of $m_b$; the decay to $\tau^+\tau^-$ is roughly
an order of magnitude smaller. The width of $H \to f\bar{f}$ is easily calculated to
lowest order and is (problem 22.11)

$$\Gamma(H \to f\bar{f}) = \frac{C\,G_F\,m_f^2\,m_H}{4\pi\sqrt{2}}\left(1 - \frac{4m_f^2}{m_H^2}\right)^{3/2} \tag{22.209}$$


where the colour factor $C$ is 3 for quarks and 1 for leptons. For such $m_H$
values, $\Gamma(H \to f\bar{f})$ is less than 5 MeV, and the total decay width is less than
10 MeV. QCD corrections are largely accounted for by replacing $m_f^2$ in the first
factor on the right-hand side of (22.209), which arises from the Higgs-fermion
Yukawa coupling, by the $\overline{\rm MS}$ running mass value $\overline{m}_f^2(m_H)$.

FIGURE 22.31
Higgs boson production cross sections in pp collisions at the LHC (figure
from R K Ellis, W J Stirling and B R Webber QCD and Collider Physics
1996, courtesy Cambridge University Press).
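
To get a feel for the sizes involved in (22.209), the sketch below evaluates the lowest-order width for $b\bar{b}$ and $\tau^+\tau^-$ at $m_H = 125$ GeV. The numerical inputs ($G_F$, $m_b \approx 4.18$ GeV, $m_\tau \approx 1.777$ GeV) are assumptions made for illustration, the function name is hypothetical, and no running of the quark mass to the scale $m_H$ is attempted.

```python
import math

G_F = 1.1663787e-5  # GeV^-2; assumed numerical value of the Fermi constant


def width_to_fermions(m_higgs, m_fermion, colour=1):
    """Lowest-order width (22.209) for H -> f fbar, in GeV.

    `colour` is the factor C: 3 for quarks, 1 for leptons.
    Returns 0 below the kinematic threshold m_H = 2 m_f.
    """
    if m_higgs <= 2.0 * m_fermion:
        return 0.0
    phase_space = (1.0 - 4.0 * m_fermion ** 2 / m_higgs ** 2) ** 1.5
    prefactor = colour * G_F * m_fermion ** 2 * m_higgs / (4.0 * math.pi * math.sqrt(2.0))
    return prefactor * phase_space


if __name__ == "__main__":
    m_h = 125.0  # GeV
    gamma_bb = width_to_fermions(m_h, 4.18, colour=3)   # assumed b-quark mass
    gamma_tautau = width_to_fermions(m_h, 1.777)        # assumed tau mass
    print(f"Gamma(bb)     = {1e3 * gamma_bb:.2f} MeV")
    print(f"Gamma(tautau) = {1e3 * gamma_tautau:.3f} MeV")
    print(f"ratio         = {gamma_bb / gamma_tautau:.1f}")
```

With these inputs the $b\bar{b}$ width is a few MeV and exceeds the $\tau^+\tau^-$ width by a factor between 10 and 20, driven by the $m_f^2$ dependence and the colour factor.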

However, the large rate for the process $gg \to H \to b\bar{b}$ has to compete
against a very large background from the inclusive production of $pp$ (or $p\bar{p}$) $\to
b\bar{b} + X$ via the strong interaction. The Higgs signal can be separated from such
a background by using subleading decay modes such as $H \to \gamma\gamma$. The Higgs
boson's coupling to photons is induced by quark triangle loops (figure 22.32)
or a W loop. In a similar way, the associated production modes $W^{\pm}H$, $ZH$,
allow use of the leptonic W and Z decays to reject QCD backgrounds.

Decays to a pair of vector bosons are also important. The tree-level width
for $H \to W^+W^-$ is (problem 22.11)

$$\Gamma(H \to W^+W^-) = \frac{G_F m_H^3}{8\pi\sqrt{2}}\left(1 - \frac{4M_W^2}{m_H^2}\right)^{1/2}\left(1 - \frac{4M_W^2}{m_H^2} + \frac{12M_W^4}{m_H^4}\right) \tag{22.210}$$

and the width for $H \to ZZ$ is the same with $M_W \to M_Z$ and a factor of $\frac{1}{2}$ to
allow for the two identical bosons in the final state. These widths rise rapidly
with $m_H$, reaching $\Gamma \sim 1$ GeV when $m_H \sim 200$ GeV. Even for values of
$m_H$ below the physical $W^+W^-$ and $ZZ$ thresholds, $H$ can still decay through
modes mediated by virtual bosons, via the off-shell decays $H \to WW^*$ and
$H \to ZZ^*$.

FIGURE 22.32
Higgs boson decay via quark triangle.
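
The statement that the vector-boson widths reach about 1 GeV near $m_H \sim 200$ GeV can be checked directly from (22.210). The following is a minimal sketch with assumed numerical inputs ($G_F$, $M_W \approx 80.4$ GeV, $M_Z \approx 91.19$ GeV) and a hypothetical function name; it covers only the on-shell tree-level formula and returns zero below threshold, so the off-shell decays $H \to WW^*$, $ZZ^*$ are not modelled.

```python
import math

G_F = 1.1663787e-5  # GeV^-2; assumed numerical value of the Fermi constant
M_W = 80.4          # GeV; assumed W mass
M_Z = 91.19         # GeV; assumed Z mass


def width_to_vector_bosons(m_higgs, m_vector, identical=False):
    """Tree-level on-shell width (22.210) for H -> VV, in GeV.

    Set identical=True for ZZ, which carries the extra factor 1/2.
    Returns 0 below the threshold m_H = 2 M_V.
    """
    x = 4.0 * m_vector ** 2 / m_higgs ** 2
    if x >= 1.0:
        return 0.0
    prefactor = G_F * m_higgs ** 3 / (8.0 * math.pi * math.sqrt(2.0))
    # 1 - x + 3 x^2 / 4 equals 1 - 4 M^2/m^2 + 12 M^4/m^4
    width = prefactor * math.sqrt(1.0 - x) * (1.0 - x + 0.75 * x ** 2)
    return 0.5 * width if identical else width


if __name__ == "__main__":
    for m_h in (170.0, 200.0, 300.0):
        ww = width_to_vector_bosons(m_h, M_W)
        zz = width_to_vector_bosons(m_h, M_Z, identical=True)
        print(f"m_H = {m_h:.0f} GeV: Gamma(WW) = {ww:.3f} GeV, Gamma(ZZ) = {zz:.3f} GeV")
```

The $m_H^3$ prefactor is what makes these widths grow so much faster than the fermionic width (22.209), which is linear in $m_H$.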


Figure 22.33, taken from Ellis, Stirling and Webber (1996), shows the 
complete set of phenomenologically relevant Higgs branching ratios for a Higgs 
boson with $m_H < 200$ GeV. Updated results for SM Higgs branching ratios
are reported in Dittmaier et al. (2012). 

We turn now to the experiments. The Tevatron $p\bar{p}$ collider at Fermilab
operated at $\sqrt{s} = 1.96$ TeV until its shutdown in 2011. Higgs searches were
conducted by two experiments, CDF and D0, which each collected
approximately 10 fb$^{-1}$ of data with the capability of seeing a Higgs signal in the
mass range 90-185 GeV. The analyses searched for a Higgs boson produced
through gluon fusion, in association with a vector boson, and through vector
boson fusion. The decays $H \to b\bar{b}$, $H \to W^+W^-$, $H \to ZZ$, $H \to \tau^+\tau^-$ and
$H \to \gamma\gamma$ were all studied.

The LHC is a pp collider at CERN which started running in 2010. The two 
general purpose detectors ATLAS (‘A Toroidal LHC ApparatuS’) and CMS 
(‘Compact Muon Solenoid’) were designed to study physics at the TeV scale, 
and in particular to search for the Higgs boson. In 2011, the LHC delivered
to ATLAS and CMS up to 5.1 fb$^{-1}$ of integrated luminosity of $pp$ collisions
at $\sqrt{s} = 7$ TeV. In 2012 the centre-of-mass energy was increased to 8 TeV, and by July
2012 up to 5.9 fb$^{-1}$ of further data was delivered. At the LHC, the main Higgs
boson production processes are the same as at the Tevatron, but as mentioned
above vector boson fusion is more important than production in association
with W or Z, or with $t\bar{t}$. The LHC experiments are sensitive to Higgs bosons
of much higher mass than the Tevatron experiments, ranging from the LEP
bound (22.208) up to about 600 GeV. The same decay channels were studied
as at the Tevatron. 

FIGURE 22.33
Branching ratios of the Higgs boson (figure from R K Ellis, W J Stirling and
B R Webber QCD and Collider Physics 1996, courtesy Cambridge University
Press).

By early 2012, the ATLAS and CMS experiments had excluded an $m_H$
value in the interval 129 GeV to 539 GeV at the 95% CL, and the mass
region 120-130 GeV was under intensive study, excesses of events having been
reported by both experiments in the region 124-126 GeV (Aad et al. 2012a,
Chatrchyan et al. 2012a). Then, on July 4, 2012, the ATLAS and CMS
collaborations simultaneously announced the observation (at a significance
greater than $5\sigma$) of a new boson with a mass in the range 125-126 GeV and
with properties compatible with those of a SM Higgs boson. These results
(updated) are reported in Aad et al. (2012b) and Chatrchyan et al. (2012b).
The crucial channels in the discovery were the decay modes $H \to \gamma\gamma$ and
$H \to ZZ^* \to 4$ leptons, both of which provide a high-resolution invariant
mass for fully reconstructed candidates. The cover illustration for Volume 1
of this book (copyright CERN) shows a candidate $\gamma\gamma$ event recorded by CMS,
and that for volume 2 (copyright CERN) shows a candidate four muon event
recorded by ATLAS. The channel $H \to WW^* \to \ell\nu\ell\nu$ is equally sensitive but
has low resolution. The ATLAS result for the mass of the boson was (Aad et
al. 2012b)


$$126.0 \pm 0.4\,(\text{stat}) \pm 0.4\,(\text{syst.})\ \text{GeV} \tag{22.211}$$


and the CMS result was (Chatrchyan et al. 2012b) 


$$125.3 \pm 0.4\,(\text{stat}) \pm 0.5\,(\text{syst.})\ \text{GeV}. \tag{22.212}$$

At about the same time, the CDF and D0 collaborations at the Tevatron
reported the combined results of their searches for a SM Higgs boson produced
in association with a W or a Z boson, and subsequently decaying to a $b\bar{b}$ pair.
The data corresponded to an integrated luminosity of 9.7 fb$^{-1}$. An excess of
events was observed in the mass range 120-135 GeV, at a significance of $3\sigma$,
which was interpreted as evidence for a new particle, consistent with the SM
Higgs boson (Aaltonen et al. 2012). This provided the first strong indication
for the decay of the new particle to a fermion-antifermion pair at a rate
consistent with the SM prediction.


Is the particle discovered by the ATLAS and CMS collaborations the Higgs 
boson of the Standard Model? The decay to two photons implies that its 
spin cannot be unity (Landau 1948, Yang 1950), but spin-0 has not yet been 
established. Even so, this already implies that the particle is different from all 
the other SM particles. The decay modes $\gamma\gamma$, $ZZ^*$, $WW^*$ have been observed
by ATLAS and CMS, and $b\bar{b}$ by CDF/D0. The $\tau^+\tau^-$ mode has not yet been
seen. A measure of the compatibility of the observed boson with the SM
Higgs boson is provided by the best-fit value of the common signal strength
parameter $\mu$ defined by

$$\mu = \sigma\cdot\text{BR}/(\sigma\cdot\text{BR})_{\rm SM} \tag{22.213}$$

where $\sigma$ is the boson production cross section and BR is the branching ratio
of the boson to the observed final state. ‘SM’ denotes the SM prediction, so
that the value $\mu = 1$ is the SM hypothesis. ATLAS reported a best-fit $\mu$-value
of $\mu = 1.4 \pm 0.3$ for $m_H = 126$ GeV; the $\mu$-values for the individual channels
were all within one standard deviation (s.d.) of unity. CMS reported a best-fit
value of $\mu = 0.87 \pm 0.23$ at $m_H = 125.5$ GeV, and again the individual values
in the observed channels were within 1 s.d. of unity. The conclusion is that
these results are consistent, within uncertainties, with the predictions for the 
SM Higgs boson. 
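
The compatibility statement can be made slightly more quantitative by expressing each quoted best-fit $\mu$ as a deviation from the SM value $\mu = 1$ in units of its own uncertainty. This is only an illustrative sketch: it treats the quoted errors as Gaussian and symmetric, which is an assumption, and the function name is hypothetical.

```python
def pull(mu, sigma, expected=1.0):
    """Deviation of a measured signal strength from `expected`, in units of sigma."""
    return (mu - expected) / sigma


# Best-fit values and uncertainties as quoted in the text.
MEASUREMENTS = {"ATLAS": (1.4, 0.3), "CMS": (0.87, 0.23)}

if __name__ == "__main__":
    for name, (mu, sigma) in MEASUREMENTS.items():
        print(f"{name}: mu = {mu} +/- {sigma}, pull = {pull(mu, sigma):+.2f} s.d.")
```

Both overall values lie within about 1.5 s.d. of unity, one above and one below.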


We end this book with a discovery which opens a new era in particle 
physics, in which the electroweak symmetry-breaking (Higgs) sector will be 
rigorously tested. The aim will be to measure the couplings of the new boson 
to the other SM particles with increasing accuracy, so as to reveal possible 
deviations from the SM values. The level of precision required to provide 
clear pointers to physics beyond the SM may be very high (see for example 
Gupta et al. 2012). The LHC will continue running until early 2013, when 
it will be shut down for machine improvements needed to allow operation at 
$\sqrt{s} = 14$ TeV and higher luminosity; beyond that, the High Luminosity LHC
is planned to begin data-taking in 2022. However, just as the discovery of the
W and Z bosons at the CERN $p\bar{p}$ collider was followed by precision studies
at the $e^+e^-$ colliders LEP and SLC, a lepton collider is likely to be needed for
the next stage of this fundamental exploration.

Problems


22.1 


(a) Using the representation for $\boldsymbol{\alpha}$, $\beta$ and $\gamma_5$ introduced in section 20.2.2
(equation (20.14)), massless particles are described by spinors of the
form

$$u = E^{1/2}\begin{pmatrix}\phi_+\\ \phi_-\end{pmatrix} \quad (\text{normalized to } u^\dagger u = 2E)$$

where $\boldsymbol{\sigma}\cdot\hat{\boldsymbol{p}}\,\phi_\pm = \pm\phi_\pm$, $\hat{\boldsymbol{p}} = \boldsymbol{p}/|\boldsymbol{p}|$. Find the explicit form of $u$ for the
case $\hat{\boldsymbol{p}} = (\sin\theta, 0, \cos\theta)$.


(b) Consider the process $\bar{\nu}_\mu + \mu^- \to \bar{\nu}_e + e^-$, discussed in section 22.1, in the
limit in which all masses are neglected. The amplitude is proportional
to

$$G_F\,\bar{v}(\bar{\nu}_\mu, R)\gamma_\mu(1 - \gamma_5)u(\mu^-, L)\,\bar{u}(e^-, L)\gamma^\mu(1 - \gamma_5)v(\bar{\nu}_e, R)$$

where we have explicitly indicated the appropriate helicities R or L
(note that, as explained in section 20.2.2, $(1 - \gamma_5)/2$ is the projection
operator for a right-handed antineutrino). In the CM frame,
let the initial $\mu^-$ momentum be $(0, 0, E)$ and the final $e^-$ momentum
be $E(\sin\theta, 0, \cos\theta)$. Verify that the amplitude is proportional to
$G_F E^2(1 + \cos\theta)$. (Hint: evaluate the ‘easy’ part $\bar{v}(\bar{\nu}_\mu)\gamma_\mu(1 - \gamma_5)u(\mu^-)$
first; this will show that the components $\mu = 0, z$ vanish, so that only
the $\mu = x, y$ components of the dot product need to be calculated.)


22.2 Verify equation (22.20). 


22.3 Check that when the polarization vector of each photon in figures 22.7(a) 
and (b) is replaced by the corresponding photon momentum, the sum of these 
two amplitudes vanishes. 


22.4 By identifying the part of (22.45) which has the form (22.57), derive 
(22.58). 


22.5 Using the vertex (22.48), verify (22.79). 
22.6 Insert (22.29) into (22.151) to derive (22.153). 


22.7 Verify that the neutral current part of (22.159) is diagonal in the ‘mass’ 
basis. 


22.8 Suppose that the Higgs field is a triplet of SU(2), rather than a doublet;
and suppose that its vacuum value is

$$\langle 0|\hat{\phi}|0\rangle = \begin{pmatrix} 0\\ 0\\ f \end{pmatrix}$$

in the gauge in which it is real. The non-vanishing component has $t_3 = -1$,
using

$$t_3 = \begin{pmatrix} 1&0&0\\ 0&0&0\\ 0&0&-1 \end{pmatrix}$$

in the ‘angular-momentum-like’ basis. Since we want the charge of the vacuum
to be zero, and we have $Q = t_3 + y/2$, we must assign $y(\phi) = 2$. So the
covariant derivative on $\phi$ is

$$(\partial_\mu + ig\,\boldsymbol{t}\cdot\hat{\boldsymbol{W}}_\mu + ig'\hat{B}_\mu)\hat{\phi},$$

where

$$t_1 = \begin{pmatrix} 0&\frac{1}{\sqrt{2}}&0\\ \frac{1}{\sqrt{2}}&0&\frac{1}{\sqrt{2}}\\ 0&\frac{1}{\sqrt{2}}&0 \end{pmatrix}, \qquad t_2 = \begin{pmatrix} 0&\frac{-i}{\sqrt{2}}&0\\ \frac{i}{\sqrt{2}}&0&\frac{-i}{\sqrt{2}}\\ 0&\frac{i}{\sqrt{2}}&0 \end{pmatrix}$$

and $t_3$ is as above (it is easy to check that these three matrices do satisfy the
required SU(2) commutation relations $[t_1, t_2] = it_3$). Show that the photon
and Z fields are still given by (22.36) and (22.37), with the same $\sin\theta_W$ as in
(22.39), but that now

$$M_Z = \sqrt{2}\,M_W/\cos\theta_W.$$

What is the value of the parameter $\rho$ in this model?
22.9 Use (22.188) to verify (22.190). 
22.10 Calculate the lifetime of the top quark to decay via $t \to W^+ + b$.


22.11 Using the Higgs couplings given in appendix Q, verify (22.209) and
(22.210).

Group Theory


M.1 Definition and simple examples 


A group G is a set of elements (a,b, c,...) with a law for combining any two 
elements a, b so as to form their ordered ‘product’ ab, such that the following 
four conditions hold: 


(i) For every $a, b \in G$, the product $ab \in G$ (the symbol ‘$\in$’ means ‘belongs
to’, or ‘is a member of’).


(ii) The law of combination is associative, i.e.

$$(ab)c = a(bc). \tag{M.1}$$


(iii) $G$ contains a unique identity element, $e$, such that for all $a \in G$,

$$ae = ea = a. \tag{M.2}$$


(iv) For all $a \in G$, there is a unique inverse element, $a^{-1}$, such that

$$aa^{-1} = a^{-1}a = e. \tag{M.3}$$


Note that in general the law of combination is not commutative, i.e.
$ab \neq ba$; if it is commutative, the group is Abelian; if not, it is non-Abelian.
Any finite set of elements satisfying the conditions (i)-(iv) forms a finite group,
the order of the group being equal to the number of elements in the set. If 
the set does not have a finite number of elements it is an infinite group. 

As a simple example, the set of four numbers $(1, i, -1, -i)$ form a finite
Abelian group of order 4, with the law of combination being ordinary
multiplication. The reader may check that each of (i)-(iv) is satisfied, with $e$ taken
to be the number 1, and the inverse being the algebraic reciprocal. A second
group of order 4 is provided by the matrices

$$\begin{pmatrix} 1&0\\ 0&1 \end{pmatrix}, \quad \begin{pmatrix} 0&1\\ -1&0 \end{pmatrix}, \quad \begin{pmatrix} -1&0\\ 0&-1 \end{pmatrix}, \quad \begin{pmatrix} 0&-1\\ 1&0 \end{pmatrix}, \tag{M.4}$$

with the combination law being matrix multiplication, ‘$e$’ being the first (unit)
matrix, and the inverse being the usual matrix inverse. Although matrix
multiplication is not commutative in general, it happens to be so for these
particular matrices. In fact, the way these four matrices multiply together
is (as the reader can verify) exactly the same as the way the four numbers
$(1, i, -1, -i)$ (in that order) do. Further, the correspondence between the
elements of the two groups is ‘one to one’: that is, if we label the two sets
of group elements by $(e, a, b, c)$ and $(e', a', b', c')$, we have the correspondences
$e \leftrightarrow e'$, $a \leftrightarrow a'$, $b \leftrightarrow b'$, $c \leftrightarrow c'$. Two groups with the same multiplication
structure, and with a one-to-one correspondence between their elements, are
said to be isomorphic. If they have the same multiplication structure but the
correspondence is not one-to-one, they are homomorphic.
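
The claim that two order-4 groups multiply in the same way can be checked mechanically by comparing their multiplication (Cayley) tables. The sketch below does this for the numbers $(1, i, -1, -i)$ and for a set of four $2 \times 2$ matrices; the particular matrices used (the unit matrix and rotations through multiples of 90 degrees) are an assumption made for illustration, and all function names are hypothetical.

```python
NUMBERS = [1, 1j, -1, -1j]

# Assumed matrix set: the unit matrix and rotations by multiples of 90 degrees.
MATRICES = [
    ((1, 0), (0, 1)),
    ((0, 1), (-1, 0)),
    ((-1, 0), (0, -1)),
    ((0, -1), (1, 0)),
]


def matmul2(a, b):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )


def cayley_table(elements, combine):
    """Entry [i][j] is the position in `elements` of elements[i] * elements[j]."""
    return [[elements.index(combine(x, y)) for y in elements] for x in elements]


def is_abelian(table):
    """A group is Abelian if its Cayley table is symmetric."""
    n = len(table)
    return all(table[i][j] == table[j][i] for i in range(n) for j in range(n))


number_table = cayley_table(NUMBERS, lambda x, y: x * y)
matrix_table = cayley_table(MATRICES, matmul2)

if __name__ == "__main__":
    print("same multiplication table:", number_table == matrix_table)
    print("Abelian:", is_abelian(number_table))
```

Because the tables are built from positions in each list, equality of the two tables is exactly the statement that the order-preserving correspondence between the lists respects the group multiplication.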


M.2 Lie groups 


We are interested in continuous groups, that is, groups whose elements are
labelled by a number of continuously variable real parameters $\alpha_1, \alpha_2, \ldots, \alpha_r$:
$g(\alpha_1, \alpha_2, \ldots, \alpha_r) \equiv g(\boldsymbol{\alpha})$. In particular, we are concerned with various kinds of
‘coordinate transformations’ (not necessarily space-time ones, but including
also ‘internal’ transformations such as those of SU(3)). For example, rotations
in three dimensions form a group, whose elements are specified by three real
parameters (e.g. two for defining the axis of the rotation, and one for the angle
of rotation about that axis). Lorentz transformations also form a group, this
time with six real parameters (three for 3-D rotations, three for pure velocity
transformations). The matrices of SU(3) are specified by the values of eight
real parameters. By convention, parametrizations are arranged in such a
way that $g(\boldsymbol{0})$ is the identity element of the group. For a continuous group,
condition (i) takes the form

$$g(\boldsymbol{\alpha})g(\boldsymbol{\beta}) = g(\boldsymbol{\gamma}(\boldsymbol{\alpha}, \boldsymbol{\beta})), \tag{M.5}$$

where the parameters $\boldsymbol{\gamma}$ are continuous functions of the parameters $\boldsymbol{\alpha}$ and $\boldsymbol{\beta}$.
A more restrictive condition is that $\boldsymbol{\gamma}$ should be an analytic function of $\boldsymbol{\alpha}$ and
$\boldsymbol{\beta}$; if this is the case, the group is a Lie group.

The analyticity condition implies that if we are given the form of the 
group elements in the neighbourhood of any one element, we can ‘move out’ 
from that neighbourhood to other nearby elements, using the mathematical 
procedure known as 'analytic continuation' (essentially, using a power series 
expansion); by repeating the process, we should be able to reach all group 
elements which are ‘continuously connected’ to the original element. The 
simplest group element to consider is the identity, which we shall now denote 
by I. Lie proved that the properties of the elements of a Lie group which can 
be reached continuously from the identity I are determined from elements 
lying in the neighbourhood of $I$.

M.3 Generators of Lie groups

Consider (following Lichtenberg 1970, chapter 5) a group of transformations
defined by

$$x_i' = f_i(x_1, x_2, \ldots, x_N; \alpha_1, \alpha_2, \ldots, \alpha_r), \tag{M.6}$$

where the $x_i$'s ($i = 1, 2, \ldots, N$) are the ‘coordinates’ on which the
transformations act, and the $\alpha$'s are the (real) parameters of the transformations. By
convention, $\boldsymbol{\alpha} = \boldsymbol{0}$ is the identity transformation, so

$$x_i = f_i(x_1, x_2, \ldots, x_N; 0, 0, \ldots, 0). \tag{M.7}$$

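A transformation in the neighbourhood of the identity is then given by

$$dx_i = \sum_{\nu=1}^{r} \frac{\partial f_i}{\partial\alpha_\nu}\,d\alpha_\nu, \tag{M.8}$$

where the $\{d\alpha_\nu\}$ are infinitesimal parameters, and the partial derivative is
understood to be evaluated at the point $(\boldsymbol{x}, \boldsymbol{0})$.

Consider now the change in a function $F(\boldsymbol{x})$ under the infinitesimal
transformation (M.8). We have

$$\begin{aligned}
F \to F + dF &= F + \sum_{i=1}^{N}\frac{\partial F}{\partial x_i}\,dx_i \\
&= F + \sum_{i=1}^{N}\left\{\sum_{\nu=1}^{r}\frac{\partial f_i}{\partial\alpha_\nu}\,d\alpha_\nu\right\}\frac{\partial F}{\partial x_i} \\
&= \left\{1 - \sum_{\nu=1}^{r} d\alpha_\nu\, i\hat{X}_\nu\right\}F,
\end{aligned} \tag{M.9}$$

where

$$\hat{X}_\nu \equiv i\sum_{i=1}^{N}\frac{\partial f_i}{\partial\alpha_\nu}\frac{\partial}{\partial x_i} \tag{M.10}$$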

is a generator of infinitesimal transformations$^1$. Note that in (M.10) $\nu$ runs
from 1 to $r$, so there are as many generators as there are parameters labelling
the group elements. Finite transformations are obtained by ‘exponentiating’
the quantity in braces in (M.9) (compare (12.30)):

$$\hat{U}(\boldsymbol{\alpha}) = \exp\{-i\boldsymbol{\alpha}\cdot\hat{\boldsymbol{X}}\}, \tag{M.11}$$

where we have written $\sum_{\nu=1}^{r}\alpha_\nu\hat{X}_\nu = \boldsymbol{\alpha}\cdot\hat{\boldsymbol{X}}$.

$^1$Clearly there is a lot of ‘convention’ (the sign, the $i$) in the definition of $\hat{X}_\nu$. It is chosen
for convenient consistency with familiar generators, for example those of SO(3) (see section
M.4.1).

An important theorem states that the commutator of any two generators
of a Lie group is a linear combination of the generators:

$$[\hat{X}_\lambda, \hat{X}_\mu] = c^{\nu}_{\lambda\mu}\hat{X}_\nu, \tag{M.12}$$

where the constants $c^{\nu}_{\lambda\mu}$ are complex numbers called the structure constants
of the group; a sum over $\nu$ from 1 to $r$ is understood on the right-hand side.
The commutation relations (M.12) are called the algebra of the group.


M.4 Examples

M.4.1 SO(3) and three-dimensional rotations

Rotations in three dimensions are defined by

$$\boldsymbol{x}' = R\,\boldsymbol{x}, \tag{M.13}$$


where $R$ is a real $3 \times 3$ matrix such that the length of $\boldsymbol{x}$ is preserved, i.e.
$\boldsymbol{x}'^{\rm T}\boldsymbol{x}' = \boldsymbol{x}^{\rm T}\boldsymbol{x}$. This implies that $R^{\rm T}R = I$, so that $R$ is an orthogonal matrix.
It follows that

$$1 = \det(R^{\rm T}R) = \det R^{\rm T}\,\det R = (\det R)^2, \tag{M.14}$$

and so $\det R = \pm 1$. Those $R$'s with $\det R = -1$ include a parity transformation
($\boldsymbol{x}' = -\boldsymbol{x}$), which is not continuously connected to the identity. Those with
$\det R = 1$ are ‘proper rotations’, and they form the elements of the group
SO(3): the Special Orthogonal group in 3 dimensions.

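An $R$ close to the identity matrix $I$ can be written as $R = I + \delta R$ where

$$(I + \delta R)^{\rm T}(I + \delta R) = I. \tag{M.15}$$

Expanding this out to first order in $\delta R$ gives

$$\delta R^{\rm T} = -\delta R, \tag{M.16}$$

so that $\delta R$ is an antisymmetric $3 \times 3$ matrix (compare (12.19)). We may
parametrize $\delta R$ as

$$\delta R = \begin{pmatrix} 0&\epsilon_3&-\epsilon_2\\ -\epsilon_3&0&\epsilon_1\\ \epsilon_2&-\epsilon_1&0 \end{pmatrix}, \tag{M.17}$$
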
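and an infinitesimal rotation is then given by

$$\boldsymbol{x}' = \boldsymbol{x} - \boldsymbol{\epsilon}\times\boldsymbol{x}, \tag{M.18}$$

(compare (12.64)), or

$$dx_1 = -\epsilon_2 x_3 + \epsilon_3 x_2, \quad dx_2 = -\epsilon_3 x_1 + \epsilon_1 x_3, \quad dx_3 = -\epsilon_1 x_2 + \epsilon_2 x_1. \tag{M.19}$$

Thus in (M.8), identifying $d\alpha_1 \equiv \epsilon_1$, $d\alpha_2 \equiv \epsilon_2$, $d\alpha_3 \equiv \epsilon_3$, we have

$$\frac{\partial f_1}{\partial\alpha_1} = 0, \quad \frac{\partial f_1}{\partial\alpha_2} = -x_3, \quad \frac{\partial f_1}{\partial\alpha_3} = x_2, \quad \text{etc.} \tag{M.20}$$

The generators (M.10) are then

$$\begin{aligned}
\hat{X}_1 &= ix_3\frac{\partial}{\partial x_2} - ix_2\frac{\partial}{\partial x_3} \\
\hat{X}_2 &= ix_1\frac{\partial}{\partial x_3} - ix_3\frac{\partial}{\partial x_1} \\
\hat{X}_3 &= ix_2\frac{\partial}{\partial x_1} - ix_1\frac{\partial}{\partial x_2}
\end{aligned} \tag{M.21}$$

which are easily recognized as the quantum-mechanical angular momentum
operators

$$\hat{\boldsymbol{X}} = \boldsymbol{x}\times(-i\boldsymbol{\nabla}), \tag{M.22}$$

which satisfy the SO(3) algebra

$$[\hat{X}_i, \hat{X}_j] = i\epsilon_{ijk}\hat{X}_k. \tag{M.23}$$

The action of finite rotations, parametrized by $\boldsymbol{\alpha} = (\alpha_1, \alpha_2, \alpha_3)$, on functions
$F$ is given by

$$\hat{U}(\boldsymbol{\alpha}) = \exp\{-i\boldsymbol{\alpha}\cdot\hat{\boldsymbol{X}}\}. \tag{M.24}$$

The operators $\hat{U}(\boldsymbol{\alpha})$ form a group which is isomorphic to SO(3). The structure
constants of SO(3) are $i\epsilon_{ijk}$, from (M.23).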
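
The algebra (M.23) can be spot-checked by letting the differential operators (M.21) act on explicit polynomials. The following is a minimal sketch, not a proof: polynomials are stored as dictionaries mapping exponent tuples to coefficients, each generator is a list of terms `coef * x_a d/dx_b`, and all names are hypothetical helpers introduced for the illustration.

```python
def times_x(poly, i):
    """Multiply a polynomial {exponents: coefficient} by the coordinate x_i."""
    return {e[:i] + (e[i] + 1,) + e[i + 1:]: c for e, c in poly.items()}


def d_dx(poly, i):
    """Differentiate a polynomial with respect to x_i."""
    return {e[:i] + (e[i] - 1,) + e[i + 1:]: c * e[i]
            for e, c in poly.items() if e[i] > 0}


def combine(p, q, factor=1):
    """Return p + factor * q, dropping terms whose coefficient vanishes."""
    out = dict(p)
    for e, c in q.items():
        out[e] = out.get(e, 0) + factor * c
    return {e: c for e, c in out.items() if c != 0}


def apply(op, poly):
    """Apply op = [(coef, a, b), ...], i.e. the sum of coef * x_a d/dx_b."""
    result = {}
    for coef, a, b in op:
        result = combine(result, times_x(d_dx(poly, b), a), coef)
    return result


def commutator(op_a, op_b, poly):
    """Return [A, B] acting on poly."""
    return combine(apply(op_a, apply(op_b, poly)),
                   apply(op_b, apply(op_a, poly)), -1)


def epsilon(i, j, k):
    """Levi-Civita symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2


# Generators (M.21); indices are 0-based, so x_1 -> 0, x_2 -> 1, x_3 -> 2.
X = [
    [(1j, 2, 1), (-1j, 1, 2)],  # X_1 = i x_3 d/dx_2 - i x_2 d/dx_3
    [(1j, 0, 2), (-1j, 2, 0)],  # X_2 = i x_1 d/dx_3 - i x_3 d/dx_1
    [(1j, 1, 0), (-1j, 0, 1)],  # X_3 = i x_2 d/dx_1 - i x_1 d/dx_2
]


def satisfies_so3_algebra(poly):
    """Check [X_i, X_j] F = i eps_ijk X_k F, equation (M.23), on F = poly."""
    for i in range(3):
        for j in range(3):
            rhs = {}
            for k in range(3):
                rhs = combine(rhs, apply(X[k], poly), 1j * epsilon(i, j, k))
            if combine(commutator(X[i], X[j], poly), rhs, -1):
                return False
    return True


if __name__ == "__main__":
    test_function = {(2, 1, 0): 1, (0, 1, 1): 3, (0, 0, 3): -2}
    print("algebra (M.23) holds on the test polynomial:",
          satisfies_so3_algebra(test_function))
```

The same machinery shows that a rotationally invariant function such as $x_1^2 + x_2^2 + x_3^2$ is annihilated by each generator, as expected for angular momentum operators.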


M.4.2 SU(2) 


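We write the infinitesimal SU(2) transformation (acting on a general complex
two-component column vector) as (cf (12.27))

$$\begin{pmatrix} q_1'\\ q_2' \end{pmatrix} = (1 + i\boldsymbol{\epsilon}\cdot\boldsymbol{\tau}/2)\begin{pmatrix} q_1\\ q_2 \end{pmatrix} \tag{M.25}$$

so that

$$\begin{aligned}
dq_1 &= \frac{i\epsilon_3}{2}\,q_1 + \left(\frac{i\epsilon_1}{2} + \frac{\epsilon_2}{2}\right)q_2 \\
dq_2 &= \frac{-i\epsilon_3}{2}\,q_2 + \left(\frac{i\epsilon_1}{2} - \frac{\epsilon_2}{2}\right)q_1.
\end{aligned} \tag{M.26}$$
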

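Then (with $d\alpha_1 \equiv \epsilon_1$ etc.)

$$\frac{\partial f_1}{\partial\alpha_1} = \frac{iq_2}{2}, \quad \frac{\partial f_1}{\partial\alpha_2} = \frac{q_2}{2}, \quad \frac{\partial f_1}{\partial\alpha_3} = \frac{iq_1}{2} \tag{M.27}$$

$$\frac{\partial f_2}{\partial\alpha_1} = \frac{iq_1}{2}, \quad \frac{\partial f_2}{\partial\alpha_2} = -\frac{q_1}{2}, \quad \frac{\partial f_2}{\partial\alpha_3} = -\frac{iq_2}{2} \tag{M.28}$$

and (from (M.10))

$$\hat{X}_1' = -\frac{1}{2}\left\{q_2\frac{\partial}{\partial q_1} + q_1\frac{\partial}{\partial q_2}\right\} \tag{M.29}$$

$$\hat{X}_2' = \frac{i}{2}\left\{q_2\frac{\partial}{\partial q_1} - q_1\frac{\partial}{\partial q_2}\right\} \tag{M.30}$$

$$\hat{X}_3' = \frac{1}{2}\left\{-q_1\frac{\partial}{\partial q_1} + q_2\frac{\partial}{\partial q_2}\right\}. \tag{M.31}$$

It is an interesting exercise to check that the commutation relations of the
$\hat{X}_i'$'s are exactly the same as those of the $\hat{X}_i$'s in (M.23). The two groups are
therefore said to have the same algebra, with the same structure constants,
and they are in fact isomorphic in the vicinity of their respective identity
elements. They are not the same for ‘large’ transformations, however, as we
discuss in section M.7.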
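
As an illustration of the exercise, here is one of the three commutators worked out explicitly. It assumes the generators in the form that follows from (M.10) with (M.27) and (M.28), restated on the first line; the shorthand operators $A$ and $B$ are introduced only for this calculation.

```latex
\begin{align*}
A &\equiv q_2\frac{\partial}{\partial q_1}, \qquad
B \equiv q_1\frac{\partial}{\partial q_2}, \qquad
\hat{X}_1' = -\tfrac{1}{2}(A + B), \quad
\hat{X}_2' = \tfrac{i}{2}(A - B), \quad
\hat{X}_3' = \tfrac{1}{2}\left(-q_1\frac{\partial}{\partial q_1} + q_2\frac{\partial}{\partial q_2}\right), \\
[A, B] &= q_2\frac{\partial}{\partial q_2} - q_1\frac{\partial}{\partial q_1} = 2\hat{X}_3', \\
[\hat{X}_1', \hat{X}_2'] &= -\tfrac{i}{4}\,[A + B,\; A - B]
 = -\tfrac{i}{4}\left(-2[A, B]\right)
 = \tfrac{i}{2}[A, B] = i\hat{X}_3' .
\end{align*}
```

The second-derivative terms cancel in $[A, B]$, leaving only first-order operators, which is why the commutator of two generators is again a linear combination of generators. The other two commutators follow in the same way.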


M.4.3 SO(4): The special orthogonal group in four 
dimensions 


This is the group whose elements are $4 \times 4$ matrices $S$ such that $S^{\rm T}S = I$,
where $I$ is the $4 \times 4$ unit matrix, with the condition $\det S = +1$. The Euclidean
(length)$^2$ $x_1^2 + x_2^2 + x_3^2 + x_4^2$ is left invariant under SO(4) transformations.
Infinitesimal SO(4) transformations are characterized by the 4-D analogue
of those for SO(3), namely by $4 \times 4$ real antisymmetric matrices $\delta S$, which
have 6 real parameters. We choose to parametrize $\delta S$ in such a way that the
Euclidean 4-vector $(\boldsymbol{x}, x_4)$ is transformed to (cf (18.76) and (18.77))

$$\begin{aligned}
\boldsymbol{x}' &= \boldsymbol{x} - \boldsymbol{\epsilon}\times\boldsymbol{x} - \boldsymbol{\eta}\,x_4, \\
x_4' &= x_4 + \boldsymbol{\eta}\cdot\boldsymbol{x},
\end{aligned} \tag{M.32}$$

where $\boldsymbol{x} = (x_1, x_2, x_3)$ and $\boldsymbol{\eta} = (\eta_1, \eta_2, \eta_3)$. Note that the first three
components transform by (M.18) when $\boldsymbol{\eta} = \boldsymbol{0}$, so that SO(3) is a subgroup of SO(4).
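The six generators are (with $d\alpha_1 \equiv \epsilon_1$ etc.)

$$\hat{X}_1 = ix_3\frac{\partial}{\partial x_2} - ix_2\frac{\partial}{\partial x_3} \tag{M.33}$$

and similarly for $\hat{X}_2$ and $\hat{X}_3$ as in (M.21), together with (defining $d\alpha_4 \equiv \eta_1$
etc.)

$$\hat{X}_4 = i\left(-x_4\frac{\partial}{\partial x_1} + x_1\frac{\partial}{\partial x_4}\right) \tag{M.34}$$

$$\hat{X}_5 = i\left(-x_4\frac{\partial}{\partial x_2} + x_2\frac{\partial}{\partial x_4}\right) \tag{M.35}$$

$$\hat{X}_6 = i\left(-x_4\frac{\partial}{\partial x_3} + x_3\frac{\partial}{\partial x_4}\right). \tag{M.36}$$

Relabelling these last three generators as $\hat{Y}_1 \equiv \hat{X}_4$, $\hat{Y}_2 \equiv \hat{X}_5$, $\hat{Y}_3 \equiv \hat{X}_6$, we
find the following algebra:

$$[\hat{X}_i, \hat{X}_j] = i\epsilon_{ijk}\hat{X}_k \tag{M.37}$$

$$[\hat{X}_i, \hat{Y}_j] = i\epsilon_{ijk}\hat{Y}_k \tag{M.38}$$

$$[\hat{Y}_i, \hat{Y}_j] = i\epsilon_{ijk}\hat{X}_k, \tag{M.39}$$

together with

$$[\hat{X}_1, \hat{Y}_1] = [\hat{X}_2, \hat{Y}_2] = [\hat{X}_3, \hat{Y}_3] = 0. \tag{M.40}$$

(M.37) confirms that the three generators controlling infinitesimal
transformations among the first three components $\boldsymbol{x}$ obey the angular momentum
commutation relations. (M.37)-(M.40) constitute the algebra of SO(4).

This algebra may be simplified by introducing the linear combinations

$$\hat{M}_i = \tfrac{1}{2}(\hat{X}_i + \hat{Y}_i) \tag{M.41}$$

$$\hat{N}_i = \tfrac{1}{2}(\hat{X}_i - \hat{Y}_i) \tag{M.42}$$

which satisfy

$$[\hat{M}_i, \hat{M}_j] = i\epsilon_{ijk}\hat{M}_k \tag{M.43}$$

$$[\hat{N}_i, \hat{N}_j] = i\epsilon_{ijk}\hat{N}_k \tag{M.44}$$

$$[\hat{M}_i, \hat{N}_j] = 0. \tag{M.45}$$

From (M.43)-(M.45) we see that, in this form, the six generators have
separated into two sets of three, each set obeying the algebra of SO(3) (or of
SU(2)), and commuting with the other set. They therefore behave like two
independent angular momentum operators. The algebra (M.43)-(M.45) is
referred to as SU(2) x SU(2).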
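
The SO(4) relations (M.37)-(M.45) can be spot-checked in the same spirit as for SO(3), by letting the six differential generators act on a polynomial in $(x_1, x_2, x_3, x_4)$. The sketch below is a numerical check on one test function, not a proof; the $\hat{Y}_k$ are taken in the form $i(x_k\,\partial/\partial x_4 - x_4\,\partial/\partial x_k)$ that follows from (M.10) and (M.32), and all helper names are hypothetical.

```python
def times_x(poly, i):
    """Multiply a polynomial {exponents: coefficient} by the coordinate x_i."""
    return {e[:i] + (e[i] + 1,) + e[i + 1:]: c for e, c in poly.items()}


def d_dx(poly, i):
    """Differentiate a polynomial with respect to x_i."""
    return {e[:i] + (e[i] - 1,) + e[i + 1:]: c * e[i]
            for e, c in poly.items() if e[i] > 0}


def combine(p, q, factor=1):
    """Return p + factor * q, dropping terms whose coefficient vanishes."""
    out = dict(p)
    for e, c in q.items():
        out[e] = out.get(e, 0) + factor * c
    return {e: c for e, c in out.items() if c != 0}


def apply(op, poly):
    """Apply op = [(coef, a, b), ...], i.e. the sum of coef * x_a d/dx_b."""
    result = {}
    for coef, a, b in op:
        result = combine(result, times_x(d_dx(poly, b), a), coef)
    return result


def commutator(op_a, op_b, poly):
    """Return [A, B] acting on poly."""
    return combine(apply(op_a, apply(op_b, poly)),
                   apply(op_b, apply(op_a, poly)), -1)


def epsilon(i, j, k):
    """Levi-Civita symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2


def scaled(op, s):
    """Multiply every term of an operator by the number s."""
    return [(s * c, a, b) for c, a, b in op]


# 0-based indices: x_1 -> 0, x_2 -> 1, x_3 -> 2, x_4 -> 3.
X = [[(1j, 2, 1), (-1j, 1, 2)], [(1j, 0, 2), (-1j, 2, 0)], [(1j, 1, 0), (-1j, 0, 1)]]
Y = [[(1j, k, 3), (-1j, 3, k)] for k in range(3)]  # Y_k = i(x_k d/dx_4 - x_4 d/dx_k)
M = [scaled(X[k], 0.5) + scaled(Y[k], 0.5) for k in range(3)]
N = [scaled(X[k], 0.5) + scaled(Y[k], -0.5) for k in range(3)]


def closes(A, B, C, poly, sign=1):
    """True if [A_i, B_j] F = sign * i eps_ijk C_k F for all i, j, with F = poly."""
    for i in range(3):
        for j in range(3):
            rhs = {}
            for k in range(3):
                rhs = combine(rhs, apply(C[k], poly), sign * 1j * epsilon(i, j, k))
            if combine(commutator(A[i], B[j], poly), rhs, -1):
                return False
    return True


def commute(A, B, poly):
    """True if [A_i, B_j] F = 0 for all i, j, with F = poly."""
    return all(not commutator(A[i], B[j], poly) for i in range(3) for j in range(3))


if __name__ == "__main__":
    F = {(1, 0, 0, 0): 1, (0, 2, 0, 1): 3, (0, 0, 1, 1): -2, (1, 1, 1, 0): 5}
    print("(M.37)-(M.39):", closes(X, X, X, F), closes(X, Y, Y, F), closes(Y, Y, X, F))
    print("(M.43)-(M.45):", closes(M, M, M, F), closes(N, N, N, F), commute(M, N, F))
```

Passing `sign=-1` to `closes(Y, Y, X, F, sign=-1)` tests the opposite sign on the right-hand side, which fails for these Euclidean generators.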


M.4.4 The Lorentz group 


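In this case the quadratic form left invariant by the transformation is the
Minkowskian one $(x^0)^2 - \boldsymbol{x}^2$ (see appendix D of volume 1). We may think of
infinitesimal Lorentz transformations as corresponding physically to ordinary
infinitesimal 3-D rotations, together with infinitesimal pure velocity
transformations (‘boosts’). The basic 4-vector then transforms by

$$\begin{aligned}
x^{0\prime} &= x^0 - \boldsymbol{\eta}\cdot\boldsymbol{x} \\
\boldsymbol{x}' &= \boldsymbol{x} - \boldsymbol{\epsilon}\times\boldsymbol{x} - \boldsymbol{\eta}\,x^0
\end{aligned} \tag{M.46}$$

where $\boldsymbol{\eta}$ is now the infinitesimal velocity parameter (the reader may check
that $(x^0)^2 - \boldsymbol{x}^2$ is indeed left invariant by (M.46), to first order in $\boldsymbol{\epsilon}$ and $\boldsymbol{\eta}$).
The six generators are then $\hat{X}_1, \hat{X}_2, \hat{X}_3$ as in (M.21), together with

$$\hat{K}_1 = -i\left(x^1\frac{\partial}{\partial x^0} + x^0\frac{\partial}{\partial x^1}\right) \tag{M.47}$$

$$\hat{K}_2 = -i\left(x^2\frac{\partial}{\partial x^0} + x^0\frac{\partial}{\partial x^2}\right) \tag{M.48}$$

$$\hat{K}_3 = -i\left(x^3\frac{\partial}{\partial x^0} + x^0\frac{\partial}{\partial x^3}\right). \tag{M.49}$$

The corresponding algebra is

$$[\hat{X}_i, \hat{X}_j] = i\epsilon_{ijk}\hat{X}_k \tag{M.50}$$

$$[\hat{X}_i, \hat{K}_j] = i\epsilon_{ijk}\hat{K}_k \tag{M.51}$$

$$[\hat{K}_i, \hat{K}_j] = -i\epsilon_{ijk}\hat{X}_k. \tag{M.52}$$

Note the minus sign on the right-hand side of (M.52) as compared with (M.39).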


M.4.5 SU(3) 


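A general infinitesimal SU(3) transformation may be written as (cf (12.71)
and (12.72))

$$\begin{pmatrix} q_1' \\ q_2' \\ q_3' \end{pmatrix} = \left(1 + \tfrac{1}{2} i \boldsymbol{\eta} \cdot \boldsymbol{\lambda}\right) \begin{pmatrix} q_1 \\ q_2 \\ q_3 \end{pmatrix} \qquad \text{(M.53)}$$

where there are now 8 of these $\eta$'s, $\boldsymbol{\eta} = (\eta_1, \eta_2, \ldots, \eta_8)$, and the $\lambda$-matrices
are the Gell-Mann matrices

$$\lambda_1 = \begin{pmatrix} 0 & 1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad \lambda_2 = \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}, \quad \lambda_3 = \begin{pmatrix} 1 & 0 & 0 \\ 0 & -1 & 0 \\ 0 & 0 & 0 \end{pmatrix} \qquad \text{(M.54)}$$

$$\lambda_4 = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ 1 & 0 & 0 \end{pmatrix}, \quad \lambda_5 = \begin{pmatrix} 0 & 0 & -i \\ 0 & 0 & 0 \\ i & 0 & 0 \end{pmatrix}, \quad \lambda_6 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & 1 \\ 0 & 1 & 0 \end{pmatrix} \qquad \text{(M.55)}$$

$$\lambda_7 = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -i \\ 0 & i & 0 \end{pmatrix}, \quad \lambda_8 = \begin{pmatrix} \frac{1}{\sqrt{3}} & 0 & 0 \\ 0 & \frac{1}{\sqrt{3}} & 0 \\ 0 & 0 & -\frac{2}{\sqrt{3}} \end{pmatrix}. \qquad \text{(M.56)}$$

In this parametrization the first three of the eight generators $\hat{G}_r$ ($r = 1, 2, \ldots, 8$)
are the same as $\hat{X}_1, \hat{X}_2, \hat{X}_3$ of (M.29)-(M.30). The others may be constructed
as usual from (M.10); for example,

$$\hat{G}_5 = \frac{i}{2}\left(q_3 \frac{\partial}{\partial q_1} - q_1 \frac{\partial}{\partial q_3}\right), \qquad \hat{G}_7 = \frac{i}{2}\left(q_3 \frac{\partial}{\partial q_2} - q_2 \frac{\partial}{\partial q_3}\right). \qquad \text{(M.57)}$$

The SU(3) algebra is found to be

$$[\hat{G}_a, \hat{G}_b] = i f_{abc} \hat{G}_c, \qquad \text{(M.58)}$$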




where $a$, $b$ and $c$ each run from 1 to 8. The structure constants are $i f_{abc}$, and
the non-vanishing $f$'s are as follows:

$$f_{123} = 1, \quad f_{147} = 1/2, \quad f_{156} = -1/2, \quad f_{246} = 1/2, \quad f_{257} = 1/2 \qquad \text{(M.59)}$$

$$f_{345} = 1/2, \quad f_{367} = -1/2, \quad f_{458} = \sqrt{3}/2, \quad f_{678} = \sqrt{3}/2. \qquad \text{(M.60)}$$

Note that the $f$'s are antisymmetric in all pairs of indices (Carruthers (1966)
chapter 2).
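The values (M.59) and (M.60) can be recovered directly from the Gell-Mann matrices. The sketch below assumes the standard normalization $\mathrm{Tr}(\lambda_a \lambda_b) = 2\delta_{ab}$, so that $[\lambda_a, \lambda_b] = 2 i f_{abc} \lambda_c$ implies $f_{abc} = \mathrm{Tr}([\lambda_a, \lambda_b]\lambda_c)/4i$. The names `LAM`, `mul`, `trace` and `f` are hypothetical helpers introduced only for this illustration.

```python
from math import sqrt

S = 1 / sqrt(3)

# Gell-Mann matrices of (M.54)-(M.56); LAM[a - 1] holds lambda_a.
LAM = [
    [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    [[0, -1j, 0], [1j, 0, 0], [0, 0, 0]],
    [[1, 0, 0], [0, -1, 0], [0, 0, 0]],
    [[0, 0, 1], [0, 0, 0], [1, 0, 0]],
    [[0, 0, -1j], [0, 0, 0], [1j, 0, 0]],
    [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
    [[0, 0, 0], [0, 0, -1j], [0, 1j, 0]],
    [[S, 0, 0], [0, S, 0], [0, 0, -2 * S]],
]

def mul(A, B):
    """Product of two square matrices stored as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def trace(A):
    return sum(A[i][i] for i in range(len(A)))

def f(a, b, c):
    """Structure constant f_abc (labels 1..8) from Tr([l_a, l_b] l_c) = 4i f_abc."""
    A, B, C = LAM[a - 1], LAM[b - 1], LAM[c - 1]
    t = trace(mul(mul(A, B), C)) - trace(mul(mul(B, A), C))
    return (t / 4j).real
```

For example, `f(4, 5, 8)` returns $\sqrt{3}/2$ up to rounding, and swapping any two arguments flips the sign, in line with the antisymmetry noted above.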


M.5 Matrix representations of generators, and of Lie 
groups 


We have shown how the generators $\hat{X}_1, \hat{X}_2, \ldots, \hat{X}_r$ of a Lie group can be
constructed as differential operators, understood to be acting on functions of the
'coordinates' to which the transformations of the group refer. These generators
satisfy certain commutation relations, the Lie algebra of the group. For
any given Lie algebra, it is also possible to find sets of matrices $X_1, X_2, \ldots, X_r$
(without hats) which satisfy the same commutation relations as the $\hat{X}_\nu$'s;
that is, they have the same algebra. Such matrices are said to form a (matrix)
representation of the Lie algebra, or equivalently of the generators. The
idea is familiar from the study of angular momentum in quantum mechanics
(Schiff 1968, section 27), where the entire theory may be developed from the
commutation relations (with $\hbar = 1$)

$$[\hat{J}_i, \hat{J}_j] = i\epsilon_{ijk}\hat{J}_k \qquad \text{(M.61)}$$

for the angular momentum operators $\hat{J}_i$, together with the physical requirement
that the $\hat{J}_i$'s (and the matrices representing them) must be Hermitian.
In this case the matrices are of the form (in quantum-mechanical notation)

$$\left(J_i^{(J)}\right)_{M_J' M_J} = \langle J M_J' | \hat{J}_i | J M_J \rangle, \qquad \text{(M.62)}$$

where $|J M_J\rangle$ is an eigenstate of $\hat{\boldsymbol{J}}^2$ and of $\hat{J}_3$ with eigenvalues $J(J+1)$ and
$M_J$ respectively. Since $M_J$ and $M_J'$ each run over the $2J + 1$ values defined by
$-J \le M_J, M_J' \le J$, the matrices $J_i^{(J)}$ are of dimension $(2J + 1) \times (2J + 1)$.
Clearly, since the generators of SU(2) have the same algebra as (M.61), an
identical matrix representation may be obtained for them; these matrices were
denoted by $T_i^{(T)}$ in section 12.1.2. It is important to note that $J$ (or $T$)
can take an infinite sequence of values $J = 0, 1/2, 1, 3/2, \ldots$, corresponding
physically to various 'spin' magnitudes. Thus there are infinitely many sets
of three matrices $(J_1^{(J)}, J_2^{(J)}, J_3^{(J)})$, all with the same commutation relations as
(M.61).

A similar method for obtaining matrix representations of Lie algebras may
be followed in other cases. In physical terms, the problem amounts to finding
a correct labelling of the base states, analogous to $|J M_J\rangle$. In the latter case,
the quantum number $J$ specifies each different representation. It does so
because (as should be familiar) the corresponding operator $\hat{\boldsymbol{J}}^2$
commutes with every generator:

$$[\hat{\boldsymbol{J}}^2, \hat{J}_i] = 0. \qquad \text{(M.63)}$$

Such an operator is called a Casimir operator, and by a lemma due to Schur
(Hamermesh 1962, pages 100-101) it must be a multiple of the unit operator
in any irreducible representation. The numerical value it has is different for each
different representation, and may therefore be used to characterize a
representation (namely as '$J = 0$', '$J = 1/2$', etc.).

In general, more than one such operator is needed to characterize a
representation completely. For example, in SO(4), the two operators $\hat{\boldsymbol{M}}^2$ and $\hat{\boldsymbol{N}}^2$
commute with all the generators, and take values $M(M+1)$ and $N(N+1)$
respectively, where $M, N = 0, 1/2, 1, \ldots$. Thus the labelling of the matrix
elements of the generators is the same as it would be for two independent
particles, one of spin $M$ and the other of spin $N$. For given $M, N$ the matrices
are of dimension $[(2M+1)(2N+1)] \times [(2M+1)(2N+1)]$. The number of
Casimir operators required to characterize a representation is called the rank
of the group (or the algebra). This is also equal to the number of independent
mutually commuting generators (though this is by no means obvious). Thus
SO(4) is a rank two group, with two commuting generators $\hat{M}_3$ and $\hat{N}_3$; so
is SU(3), since $\hat{G}_3$ and $\hat{G}_8$ commute. Two Casimir operators are therefore
required to characterize the representations of SU(3), which may be taken to
be the 'quadratic' one

$$\hat{C}_2 \equiv \hat{G}_a \hat{G}_a \qquad \text{(M.64)}$$

together with a 'cubic' one

$$\hat{C}_3 \equiv d_{abc}\, \hat{G}_a \hat{G}_b \hat{G}_c, \qquad \text{(M.65)}$$

where the coefficients $d_{abc}$ are defined by the relation

$$\{\lambda_a, \lambda_b\} = \tfrac{4}{3}\delta_{ab} I + 2 d_{abc} \lambda_c, \qquad \text{(M.66)}$$

and are symmetric in all pairs of indices (they are tabulated in Carruthers
1966, table 2.1). In practice, for the few SU(3) representations that are
actually required, it is more common to denote them (as we have in the text)
by their dimensionality, which for the cases $\boldsymbol{1}$ (singlet), $\boldsymbol{3}$ (triplet), $\boldsymbol{3}^*$
(antitriplet), $\boldsymbol{8}$ (octet) and $\boldsymbol{10}$ (decuplet) is in fact a unique labelling. The values
of $\hat{C}_2$ in these representations are

$$\hat{C}_2(\boldsymbol{1}) = 0, \quad \hat{C}_2(\boldsymbol{3}, \boldsymbol{3}^*) = 4/3, \quad \hat{C}_2(\boldsymbol{8}) = 3, \quad \hat{C}_2(\boldsymbol{10}) = 6. \qquad \text{(M.67)}$$

Having characterized a given representation by the eigenvalues of the
Casimir operator(s), a further labelling is then required to characterize the
states within a given representation (the analogue of the eigenvalue of $\hat{J}_3$ for
angular momentum). For SO(4) these further labels may be taken to be the
eigenvalues of $\hat{M}_3$ and $\hat{N}_3$; for SU(3) they are the eigenvalues of $\hat{G}_3$ and $\hat{G}_8$,
i.e. those corresponding to the third component of isospin and hypercharge,
in the flavour case (see figures 12.3 and 12.4).

In the case of groups whose elements are themselves matrices, such as
SO(3), SO(4), SU(2), SU(3), and the Lorentz group, one particular
representation of the generators may always be obtained by considering the general
form of a matrix in the group which is infinitesimally close to the unit element.
In a suitable parametrization, we may write such a matrix as

$$1 + i \sum_{\nu=1}^{r} \epsilon_\nu X_\nu^{(G)}, \qquad \text{(M.68)}$$

where $(\epsilon_1, \epsilon_2, \ldots, \epsilon_r)$ are infinitesimal parameters, and $(X_1^{(G)}, X_2^{(G)}, \ldots, X_r^{(G)})$
are matrices representing the generators of the (matrix) group G. This is
exactly the same procedure we followed for SU(2) in section 12.1.1, where
we found from (12.26) that the three $X_\nu^{(\mathrm{SU(2)})}$'s were just $\boldsymbol{\tau}/2$, satisfying the
SU(2) algebra. Similarly, in section 12.2 we saw that the eight $X_\nu^{(\mathrm{SU(3)})}$'s
were just $\boldsymbol{\lambda}/2$, satisfying the SU(3) algebra. These particular two
representations are called the fundamental representations of the SU(2) and SU(3)
algebras, respectively; they are the representations of lowest dimensionality.
For SO(3), the three $X_i^{(\mathrm{SO(3)})}$'s are (from (M.17))

$$X_1^{(\mathrm{SO(3)})} = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -i \\ 0 & i & 0 \end{pmatrix}, \quad X_2^{(\mathrm{SO(3)})} = \begin{pmatrix} 0 & 0 & i \\ 0 & 0 & 0 \\ -i & 0 & 0 \end{pmatrix}, \quad X_3^{(\mathrm{SO(3)})} = \begin{pmatrix} 0 & -i & 0 \\ i & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \qquad \text{(M.69)}$$

which are the same as the $3 \times 3$ matrices $T_i^{(1)}$ of (12.48):

$$\left(T_i^{(1)}\right)_{jk} = -i\epsilon_{ijk}. \qquad \text{(M.70)}$$

The matrices $\tau_i/2$ and $T_i^{(1)}$ correspond to the values $J = 1/2$, $J = 1$,
respectively, in angular momentum terms.

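It is not a coincidence that the coefficients on the right-hand side of (M.70)
are (minus) the SO(3) structure constants. One representation of a Lie algebra
is always provided by a set of matrices $\{X_\mu^{(R)}\}$ whose elements are defined by

$$\left(X_\mu^{(R)}\right)_{\lambda\nu} = -c^{\nu}_{\mu\lambda}, \qquad \text{(M.71)}$$

where the $c$'s are the structure constants of (M.12), and each of $\mu, \nu, \lambda$ runs
from 1 to $r$. Thus these matrices are of dimensionality $r \times r$, where $r$ is the
number of generators. That this prescription works is due to the fact that the
generators satisfy the Jacobi identity

$$[[\hat{X}_\alpha, \hat{X}_\beta], \hat{X}_\gamma] + [[\hat{X}_\beta, \hat{X}_\gamma], \hat{X}_\alpha] + [[\hat{X}_\gamma, \hat{X}_\alpha], \hat{X}_\beta] = 0. \qquad \text{(M.72)}$$

Using (M.12) to evaluate the commutators, and the fact that the generators
are independent, we obtain

$$c^{\delta}_{\alpha\beta} c^{\epsilon}_{\delta\gamma} + c^{\delta}_{\beta\gamma} c^{\epsilon}_{\delta\alpha} + c^{\delta}_{\gamma\alpha} c^{\epsilon}_{\delta\beta} = 0. \qquad \text{(M.73)}$$

The reader may fill in the steps leading from here to the desired result:

$$\left(X_\alpha^{(R)}\right)_{\gamma\delta} \left(X_\beta^{(R)}\right)_{\delta\epsilon} - \left(X_\beta^{(R)}\right)_{\gamma\delta} \left(X_\alpha^{(R)}\right)_{\delta\epsilon} = c^{\delta}_{\alpha\beta} \left(X_\delta^{(R)}\right)_{\gamma\epsilon}. \qquad \text{(M.74)}$$

(M.74) is precisely the $(\gamma\epsilon)$ matrix element of

$$[X_\alpha^{(R)}, X_\beta^{(R)}] = c^{\delta}_{\alpha\beta} X_\delta^{(R)}, \qquad \text{(M.75)}$$

showing that the $X_\mu^{(R)}$'s satisfy the group algebra (M.12), as required. The
representation in which the generators are represented by (minus) the structure
constants, in the sense of (M.71), is called the regular or adjoint
representation.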
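As an illustration of the adjoint representation, the following sketch builds the eight $8 \times 8$ matrices of SU(3) from the structure constants listed in (M.59) and (M.60). It assumes the index convention $(X_a)_{bc} = -i f_{abc}$, i.e. minus the structure constants $i f_{abc}$ of (M.58), and checks both the algebra and the octet value $\hat{C}_2(\boldsymbol{8}) = 3$ of (M.67). The helper names (`F_BASIC`, `parity`, `adjoint_algebra_holds`, `casimir`) are hypothetical.

```python
from itertools import permutations
from math import sqrt

R3 = sqrt(3) / 2

# Non-vanishing f_abc of (M.59)-(M.60), with labels 1..8.
F_BASIC = {(1, 2, 3): 1.0, (1, 4, 7): 0.5, (1, 5, 6): -0.5, (2, 4, 6): 0.5,
           (2, 5, 7): 0.5, (3, 4, 5): 0.5, (3, 6, 7): -0.5,
           (4, 5, 8): R3, (6, 7, 8): R3}

def parity(p):
    """Sign (+1 or -1) of a permutation p of (0, 1, 2)."""
    inversions = sum(1 for i in range(3) for j in range(i + 1, 3) if p[i] > p[j])
    return -1 if inversions % 2 else 1

# Totally antisymmetric array f[a][b][c], using 0-based indices.
f = [[[0.0] * 8 for _ in range(8)] for _ in range(8)]
for abc, val in F_BASIC.items():
    for p in permutations(range(3)):
        a, b, c = abc[p[0]] - 1, abc[p[1]] - 1, abc[p[2]] - 1
        f[a][b][c] = parity(p) * val

# Assumed adjoint convention: (X_a)_{bc} = -i f_abc.
X = [[[-1j * f[a][b][c] for c in range(8)] for b in range(8)] for a in range(8)]

def mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def adjoint_algebra_holds(tol=1e-12):
    """True if [X_a, X_b] = i f_abc X_c for all a, b."""
    for a in range(8):
        for b in range(8):
            AB, BA = mul(X[a], X[b]), mul(X[b], X[a])
            for d in range(8):
                for e in range(8):
                    rhs = sum(1j * f[a][b][c] * X[c][d][e] for c in range(8))
                    if abs(AB[d][e] - BA[d][e] - rhs) > tol:
                        return False
    return True

def casimir():
    """Quadratic Casimir sum_a X_a X_a in the adjoint representation."""
    C = [[0j] * 8 for _ in range(8)]
    for a in range(8):
        P = mul(X[a], X[a])
        C = [[C[i][j] + P[i][j] for j in range(8)] for i in range(8)]
    return C
```

The matrix returned by `casimir()` is proportional to the unit matrix, as Schur's lemma requires, with the proportionality constant equal to 3.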

Having obtained any particular matrix representation $X^{(P)}$ of the generators
of a group G, a corresponding matrix representation of the group elements
can be obtained by exponentiation, via

$$D^{(P)}(\boldsymbol{\alpha}) = \exp\{i \boldsymbol{\alpha} \cdot \boldsymbol{X}^{(P)}\}, \qquad \text{(M.76)}$$

where $\boldsymbol{\alpha} = (\alpha_1, \alpha_2, \ldots, \alpha_r)$ (see (12.31) and (12.49) for SU(2), and (12.74)
and (12.81) for SU(3)). In the case of the groups whose elements are matrices,
exponentiating the generators $X^{(G)}$ just recreates the general matrices of the
group, so we may call this the 'self-representation': the one in which the group
elements are represented by themselves. In the more general case (M.76), the
crucial property of the matrices $D^{(P)}(\boldsymbol{\alpha})$ is that they obey the same group
combination law as the elements of the group G they are representing: that
is, if the group elements obey

$$g(\boldsymbol{\alpha}) g(\boldsymbol{\beta}) = g(\boldsymbol{\gamma}(\boldsymbol{\alpha}, \boldsymbol{\beta})), \qquad \text{(M.77)}$$

then

$$D^{(P)}(\boldsymbol{\alpha}) D^{(P)}(\boldsymbol{\beta}) = D^{(P)}(\boldsymbol{\gamma}(\boldsymbol{\alpha}, \boldsymbol{\beta})). \qquad \text{(M.78)}$$

It is a rather remarkable fact that there are certain, say, $10 \times 10$ matrices
which multiply together in exactly the same way as the rotation matrices of
SO(3).


M.6 The Lorentz group


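Consideration of matrix representations of the Lorentz group provides insight
into the equations of relativistic quantum mechanics, for example the Dirac
equation. Consider the infinitesimal Lorentz transformation (M.46). The $4 \times 4$
matrix corresponding to this may be written in the form

$$1 + i \boldsymbol{\epsilon} \cdot \boldsymbol{X}^{(\mathrm{LG})} - i \boldsymbol{\eta} \cdot \boldsymbol{K}^{(\mathrm{LG})}, \qquad \text{(M.79)}$$

where

$$X_1^{(\mathrm{LG})} = \begin{pmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & -i \\ 0 & 0 & i & 0 \end{pmatrix} \quad \text{etc.} \qquad \text{(M.80)}$$

(as in (M.69) but with an extra border of 0's), and

$$K_1^{(\mathrm{LG})} = \begin{pmatrix} 0 & -i & 0 & 0 \\ -i & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \quad K_2^{(\mathrm{LG})} = \begin{pmatrix} 0 & 0 & -i & 0 \\ 0 & 0 & 0 & 0 \\ -i & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \end{pmatrix}, \quad K_3^{(\mathrm{LG})} = \begin{pmatrix} 0 & 0 & 0 & -i \\ 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ -i & 0 & 0 & 0 \end{pmatrix}. \qquad \text{(M.81)}$$

In (M.80) and (M.81) the matrices are understood to be acting on the
four-component vector

$$\begin{pmatrix} x^0 \\ x^1 \\ x^2 \\ x^3 \end{pmatrix}. \qquad \text{(M.82)}$$

It is straightforward to check that the matrices $X_i^{(\mathrm{LG})}$ and $K_i^{(\mathrm{LG})}$ satisfy the
algebra (M.50)-(M.52) as expected.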
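The check can also be carried out by machine. The sketch below assumes the explicit forms $(X_i)_{jk} = -i\epsilon_{ijk}$ on the spatial block and $(K_i)_{0i} = (K_i)_{i0} = -i$, which are what (M.46) and (M.79) require; the helper names are hypothetical. It also tests Hermiticity, which is the subject of the next paragraph.

```python
def eps(i, j, k):
    """Levi-Civita symbol for indices 0, 1, 2."""
    return (i - j) * (j - k) * (k - i) // 2

def zeros(n):
    return [[0j] * n for _ in range(n)]

def comm(A, B):
    """Matrix commutator [A, B]."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] - B[i][k] * A[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

def close(A, B, tol=1e-12):
    return all(abs(x - y) < tol for ra, rb in zip(A, B) for x, y in zip(ra, rb))

# Row/column 0 is the time component; 1..3 are spatial, as in (M.82).
X, K = [], []
for i in range(3):
    Xi, Ki = zeros(4), zeros(4)
    for j in range(3):
        for k in range(3):
            Xi[j + 1][k + 1] = -1j * eps(i, j, k)
    Ki[0][i + 1] = Ki[i + 1][0] = -1j
    X.append(Xi)
    K.append(Ki)

def eps_sum(i, j, G, sign=1):
    """Return the matrix sign * i * sum_k eps_ijk G_k."""
    return [[sign * 1j * sum(eps(i, j, k) * G[k][a][b] for k in range(3))
             for b in range(4)] for a in range(4)]

def lorentz_algebra_holds():
    """True if X and K satisfy (M.50)-(M.52), including the minus sign in (M.52)."""
    for i in range(3):
        for j in range(3):
            if not close(comm(X[i], X[j]), eps_sum(i, j, X)):
                return False
            if not close(comm(X[i], K[j]), eps_sum(i, j, K)):
                return False
            if not close(comm(K[i], K[j]), eps_sum(i, j, X, sign=-1)):
                return False
    return True

def is_hermitian(A, tol=1e-12):
    n = len(A)
    return all(abs(A[i][j] - A[j][i].conjugate()) < tol
               for i in range(n) for j in range(n))
```

Here `is_hermitian(X[0])` is true while `is_hermitian(K[0])` is false; multiplying `K[0]` by $i$ gives a real symmetric, and hence Hermitian, matrix.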

An important point to note is that the matrices $K_i^{(\mathrm{LG})}$, in contrast to
$X_i^{(\mathrm{LG})}$ or $X_i^{(\mathrm{SO(3)})}$, and to the corresponding matrices of SU(2) and SU(3),
are not Hermitian. A theorem states that only the generators of compact
Lie groups can be represented by finite-dimensional Hermitian matrices. Here
'compact' means that the domain of variation of all the parameters is bounded
(none exceeds a given positive number $p$ in absolute magnitude) and closed
(the limit of every convergent sequence of points in the set also lies in the set).
For the Lorentz group, the limiting velocity $c$ is not included (the $\gamma$-factor goes
to infinity), and so the group is non-compact.

In a general representation of the Lorentz group, the generators $X_i$, $K_i$
will obey the algebra (M.50)-(M.52). Let us introduce the combinations

$$\boldsymbol{P} \equiv \tfrac{1}{2}(\boldsymbol{X} + i\boldsymbol{K}) \qquad \text{(M.83)}$$
$$\boldsymbol{Q} \equiv \tfrac{1}{2}(\boldsymbol{X} - i\boldsymbol{K}). \qquad \text{(M.84)}$$

Then the algebra becomes

$$[P_i, P_j] = i\epsilon_{ijk} P_k \qquad \text{(M.85)}$$
$$[Q_i, Q_j] = i\epsilon_{ijk} Q_k \qquad \text{(M.86)}$$
$$[P_i, Q_j] = 0, \qquad \text{(M.87)}$$

which are apparently the same as (M.43)-(M.45). We can see from (M.81)
that the matrices $iK^{(\mathrm{LG})}$ are Hermitian, and the same is in fact true in a
general finite-dimensional representation. So we can appropriate standard
angular momentum theory to set up the representations of the algebra of
the $P$'s and $Q$'s: they behave just like two independent (mutually
commuting) angular momenta. The eigenvalues of $\boldsymbol{P}^2$ are of the form $P(P+1)$,
for $P = 0, 1/2, \ldots$, and similarly for $\boldsymbol{Q}^2$; the eigenvalues of $P_3$ are $M_P$ where
$-P \le M_P \le P$, and similarly for $Q_3$.

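Consider the particular case where the eigenvalue of $\boldsymbol{Q}^2$ is zero ($Q = 0$),
and the value of $P$ is 1/2. The first condition implies that the $Q$'s are
identically zero, so that

$$\boldsymbol{X} = i\boldsymbol{K} \qquad \text{(M.88)}$$

in this representation, while the second condition tells us that

$$\boldsymbol{P} = \tfrac{1}{2}(\boldsymbol{X} + i\boldsymbol{K}) = \tfrac{1}{2}\boldsymbol{\sigma}, \qquad \text{(M.89)}$$

the familiar matrices for spin-1/2. We label this representation by the values
of $P$ (1/2) and $Q$ (0) (these are the eigenvalues of the two Casimir operators).
Then using (M.88) and (M.89) we find

$$\boldsymbol{X}^{(1/2,0)} = \tfrac{1}{2}\boldsymbol{\sigma} \qquad \text{(M.90)}$$

and

$$\boldsymbol{K}^{(1/2,0)} = -\tfrac{i}{2}\boldsymbol{\sigma}. \qquad \text{(M.91)}$$

Now recall that the general infinitesimal Lorentz transformation has the
form

$$1 + i\boldsymbol{\epsilon} \cdot \boldsymbol{X} - i\boldsymbol{\eta} \cdot \boldsymbol{K}. \qquad \text{(M.92)}$$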

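In the present case this becomes

$$1 + i\boldsymbol{\epsilon} \cdot \boldsymbol{\sigma}/2 - \boldsymbol{\eta} \cdot \boldsymbol{\sigma}/2. \qquad \text{(M.93)}$$

These matrices are of dimension $2 \times 2$, and act on two-component spinors,
which therefore transform under an infinitesimal Lorentz transformation by
(cf (4.19) and (4.42))

$$\phi' = (1 + i\boldsymbol{\epsilon} \cdot \boldsymbol{\sigma}/2 - \boldsymbol{\eta} \cdot \boldsymbol{\sigma}/2)\phi. \qquad \text{(M.94)}$$

We say that $\phi$ 'transforms as the (1/2, 0) representation of the Lorentz group'.
The '$1 + i\boldsymbol{\epsilon} \cdot \boldsymbol{\sigma}/2$' part is the familiar (infinitesimal) rotation matrix for spinors,
first met in section 4.4; it exponentiates to give $\exp(i\boldsymbol{\sigma} \cdot \boldsymbol{\theta}/2)$ for finite
rotations. The '$-\boldsymbol{\eta} \cdot \boldsymbol{\sigma}/2$' part shows how such a spinor transforms under a pure
(infinitesimal) velocity transformation. The finite transformation law is

$$\phi' = \exp(-\boldsymbol{\vartheta} \cdot \boldsymbol{\sigma}/2)\phi \qquad \text{(M.95)}$$

where the three real parameters $\boldsymbol{\vartheta} = (\vartheta_1, \vartheta_2, \vartheta_3)$ specify the direction and
magnitude of the boost.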

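There is, however, a second two-dimensional representation, which is
characterized by the labelling $P = 0$, $Q = 1/2$, which we denote by (0, 1/2). In
this case, the previous steps yield

$$\boldsymbol{X}^{(0,1/2)} = \tfrac{1}{2}\boldsymbol{\sigma} \qquad \text{(M.96)}$$

as before, but

$$\boldsymbol{K}^{(0,1/2)} = \tfrac{i}{2}\boldsymbol{\sigma}. \qquad \text{(M.97)}$$

So the corresponding two-component spinor $\chi$ transforms by (cf (4.19) and
(4.42))

$$\chi' = (1 + i\boldsymbol{\epsilon} \cdot \boldsymbol{\sigma}/2 + \boldsymbol{\eta} \cdot \boldsymbol{\sigma}/2)\chi. \qquad \text{(M.98)}$$

We see that $\phi$ and $\chi$ behave the same under rotations, but 'oppositely' under
boosts.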

These transformation laws are exactly what we used in section 4.1.2 when
discussing the behaviour of the Dirac wavefunction $\psi$ under Lorentz
transformations, where $\psi$ is put together from one $\phi$ and one $\chi$ via

$$\psi = \begin{pmatrix} \phi \\ \chi \end{pmatrix} \qquad \text{(M.99)}$$

and describes a massive spin-1/2 particle according to the equations

$$E\phi = \boldsymbol{\sigma} \cdot \boldsymbol{p}\, \phi + m\chi$$
$$E\chi = -\boldsymbol{\sigma} \cdot \boldsymbol{p}\, \chi + m\phi, \qquad \text{(M.100)}$$

consistent with the representation (3.40) of the Dirac matrices.


M.7 The relation between SU(2) and SO(3)


We have seen (sections M.4.1 and M.4.2) that the algebras of these two groups 
are identical. So the groups are isomorphic in the vicinity of their respective 
identity elements. Furthermore, matrix representations of one algebra auto- 
matically provide representations of the other. Since exponentiating these 
infinitesimal matrix transformations produces matrices representing group 
elements corresponding to finite transformations in both cases, it might ap- 
pear that the groups are fully isomorphic. But actually they are not, as we 
shall now discuss. 

We begin by re-considering the parameters used to characterize elements
of SO(3) and SU(2). A general 3-D rotation is described by the SO(3) matrix
$R(\hat{\boldsymbol{n}}, \theta)$, where $\hat{\boldsymbol{n}}$ is the axis of the rotation and $\theta$ is the angle of rotation. For
example,

$$R(\hat{\boldsymbol{z}}, \theta) = \begin{pmatrix} \cos\theta & \sin\theta & 0 \\ -\sin\theta & \cos\theta & 0 \\ 0 & 0 & 1 \end{pmatrix}. \qquad \text{(M.101)}$$

On the other hand, we can write the general SU(2) matrix $V$ in the form

$$V = \begin{pmatrix} a & b \\ -b^* & a^* \end{pmatrix} \qquad \text{(M.102)}$$

where $|a|^2 + |b|^2 = 1$ from the unit determinant condition. It therefore depends
on three real parameters, the choice of which we are now going to examine
in more detail than previously. In (12.32) we wrote $V$ as $\exp(i\boldsymbol{\alpha} \cdot \boldsymbol{\tau}/2)$,
which certainly involves three real parameters $\alpha_1, \alpha_2, \alpha_3$; and below (12.35)
we proposed, further, to write $\boldsymbol{\alpha} = \hat{\boldsymbol{n}}\theta$, where $\theta$ is an angle and $\hat{\boldsymbol{n}}$ is a unit
vector. Then, since (as the reader may verify)

$$\exp(i\theta\, \hat{\boldsymbol{n}} \cdot \boldsymbol{\tau}/2) = \cos\theta/2 + i\, \hat{\boldsymbol{n}} \cdot \boldsymbol{\tau} \sin\theta/2, \qquad \text{(M.103)}$$

it follows that this latter parametrization corresponds to writing, in (M.102),

$$a = \cos\theta/2 + i n_z \sin\theta/2, \qquad b = (n_y + i n_x)\sin\theta/2, \qquad \text{(M.104)}$$

with $n_x^2 + n_y^2 + n_z^2 = 1$. Clearly the condition $|a|^2 + |b|^2 = 1$ is satisfied, and
one can convince oneself that the full range of $a$ and $b$ is covered if $\theta/2$ lies
between 0 and $\pi$ (in particular, it is not necessary to extend the range of $\theta/2$
so as to include the interval $\pi$ to $2\pi$, since the corresponding region of $a, b$ can
be covered by changing the orientation of $\hat{\boldsymbol{n}}$, which has not been constrained
in any way). It follows that the parameters $\boldsymbol{\alpha}$ satisfy $\boldsymbol{\alpha}^2 \le 4\pi^2$; that is, the
space of the $\alpha$'s is the interior, and surface, of a sphere of radius $2\pi$, as shown
in figure M.1.

What about the parameter space of SO(3)? In this case, the same
parameters $\hat{\boldsymbol{n}}$ and $\theta$ specify a rotation, but now $\theta$ (rather than $\theta/2$) runs from 0 to $\pi$.
However, we may allow the range of $\theta$ to extend to $2\pi$, by taking advantage
of the fact that

$$R(\hat{\boldsymbol{n}}, \pi + \theta) = R(-\hat{\boldsymbol{n}}, \pi - \theta). \qquad \text{(M.105)}$$

Thus if we agree to limit $\hat{\boldsymbol{n}}$ to directions in the upper hemisphere of figure M.1,
for 3-D rotations, we can say that the whole sphere represents the parameter
space of SU(2), but that of SO(3) is provided by the upper half only.

FIGURE M.1
The parameter spaces of SO(3) and SU(2): the whole sphere is the parameter
space of SU(2), the upper (stippled) hemisphere that of SO(3).

Now let us consider the correspondence, or mapping, between the
matrices of SO(3) and SU(2): we want to see if it is one-to-one. The notation
strongly suggests that the matrix $V(\hat{\boldsymbol{n}}, \theta) = \exp(i\theta\, \hat{\boldsymbol{n}} \cdot \boldsymbol{\tau}/2)$ of SU(2)
corresponds to the matrix $R(\hat{\boldsymbol{n}}, \theta)$ of SO(3), but the way it actually works has a
subtlety.

We form the quantity $\boldsymbol{x} \cdot \boldsymbol{\tau}$, and assert that

$$\boldsymbol{x}' \cdot \boldsymbol{\tau} = V(\hat{\boldsymbol{n}}, \theta)\, \boldsymbol{x} \cdot \boldsymbol{\tau}\, V^\dagger(\hat{\boldsymbol{n}}, \theta), \qquad \text{(M.106)}$$

where $\boldsymbol{x}' = R(\hat{\boldsymbol{n}}, \theta)\boldsymbol{x}$. We can easily verify (M.106) for the special case $R(\hat{\boldsymbol{z}}, \theta)$,
using (M.101); the general case follows with more labour (but the general
infinitesimal case should by now be a familiar manipulation). (M.106)
establishes a precise mapping between the elements of SU(2) and those of SO(3),
but it is not one-to-one (i.e. not an isomorphism), since plainly $V$ can always
be replaced by $-V$ and $\boldsymbol{x}'$ will be unchanged, and hence so will the associated
SO(3) matrix $R(\hat{\boldsymbol{n}}, \theta)$. It is therefore a homomorphism.

Next, we prove a little theorem to the effect that the identity element $e$
of a group G must be represented by the unit matrix of the representation:
$D(e) = I$. For, let $D(a)$, $D(e)$ represent the elements $a$, $e$ of G. Then $D(ae) =
D(a)D(e)$ by the fundamental property (M.78) of representation matrices. On
the other hand, $ae = a$ by the property of $e$. So we have $D(a) = D(a)D(e)$,
and hence $D(e) = I$.

Now let us return to the correspondence between SU(2) and SO(3). $V(\hat{\boldsymbol{n}}, \theta)$
corresponds to $R(\hat{\boldsymbol{n}}, \theta)$, but can an SU(2) matrix be said to provide a valid
representation of SO(3)? Consider the case $V(\hat{\boldsymbol{n}} = \hat{\boldsymbol{z}}, \theta = 2\pi)$. From (M.103)
this is equal to

$$\begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad \text{(M.107)}$$

but the corresponding rotation matrix, from (M.101), is the identity matrix.
Hence our theorem is violated, since (M.107) is plainly not the identity matrix
of SU(2). Thus the SU(2) matrices cannot be said to represent rotations, in
the strict sense. Nevertheless, spin-1/2 particles certainly do exist, so Nature
appears to make use of these 'not quite' representations! The SU(2) identity
element is $V(\hat{\boldsymbol{n}} = \hat{\boldsymbol{z}}, \theta = 4\pi)$, confirming that the rotational properties of a
spinor are quite other than those of a classical object.

In fact, two and only two distinct elements of SU(2), namely

$$\begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \quad \text{and} \quad \begin{pmatrix} -1 & 0 \\ 0 & -1 \end{pmatrix}, \qquad \text{(M.108)}$$

correspond to the identity element of SO(3) in the correspondence (M.106),
just as, in general, $V$ and $-V$ correspond to the same SO(3) element $R(\hat{\boldsymbol{n}}, \theta)$,
as we saw. The failure to be a true representation is localized simply to a sign:
we may indeed say that, up to a sign, SU(2) matrices provide a representation
of SO(3). If we 'factor out' this sign, the groups are isomorphic. A more
mathematically precise way of saying this is given in Jones (1990, chapter 8).
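The two-to-one nature of the mapping can be made concrete with a short numerical sketch. It assumes that the rotation matrix is extracted from (M.106) through $R_{ij} = \tfrac{1}{2}\mathrm{Tr}(\tau_i V \tau_j V^\dagger)$, which follows from $\mathrm{Tr}(\tau_i \tau_j) = 2\delta_{ij}$; the function names `V` and `R_from_V` are hypothetical.

```python
from math import cos, sin, pi

# Pauli matrices tau_1, tau_2, tau_3.
TAU = [
    [[0, 1], [1, 0]],
    [[0, -1j], [1j, 0]],
    [[1, 0], [0, -1]],
]

def mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def dagger(A):
    n = len(A)
    return [[A[j][i].conjugate() for j in range(n)] for i in range(n)]

def V(n, theta):
    """SU(2) matrix cos(theta/2) + i n.tau sin(theta/2), as in (M.103)."""
    c, s = cos(theta / 2), sin(theta / 2)
    return [[(c if i == j else 0) + 1j * s * sum(n[k] * TAU[k][i][j] for k in range(3))
             for j in range(2)] for i in range(2)]

def R_from_V(v):
    """SO(3) matrix implied by (M.106): R_ij = (1/2) Tr(tau_i V tau_j V^dagger)."""
    vd = dagger(v)
    R = []
    for i in range(3):
        row = []
        for j in range(3):
            m = mul(mul(TAU[i], v), mul(TAU[j], vd))
            row.append(((m[0][0] + m[1][1]) / 2).real)
        R.append(row)
    return R

def negate(v):
    return [[-x for x in row] for row in v]
```

With `z = (0, 0, 1)`, `R_from_V(V(z, theta))` reproduces (M.101), and `R_from_V(negate(V(z, theta)))` gives the same matrix, while `V(z, 2 * pi)` is minus the unit matrix to rounding accuracy.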


N 


Geometrical Aspects of Gauge Fields 


N.1 Covariant derivatives and coordinate 
transformations 


Let us go back to the U(1) case, equations (13.4)-(13.7). There, the
introduction of the (gauge) covariant derivative $D^\mu$ produced an object, $D^\mu\psi(x)$,
which transformed like $\psi(x)$ under local U(1) phase transformations, unlike
the ordinary derivative $\partial^\mu\psi(x)$ which acquired an 'extra' piece when
transformed. This followed from simple calculus, of course, but there is a slightly
different way of thinking about it. The derivative involves not only $\psi(x)$ at
the point $x$, but also $\psi$ at the infinitesimally close, but different, point $x + dx$;
and the transformation law of $\psi(x)$ involves $\alpha(x)$, while that of $\psi(x + dx)$
would involve the different function $\alpha(x + dx)$. Thus we may perhaps expect
something to 'go wrong' with the transformation law for the gradient.

To bring out the geometrical analogy we are seeking, let us write $\psi =
\psi_R + i\psi_I$ and $\alpha(x) = q\chi(x)$ so that (13.3) becomes (cf (2.64))

$$\psi_R'(x) = \cos\alpha(x)\,\psi_R(x) - \sin\alpha(x)\,\psi_I(x)$$
$$\psi_I'(x) = \sin\alpha(x)\,\psi_R(x) + \cos\alpha(x)\,\psi_I(x). \qquad \text{(N.1)}$$

If we think of $\psi_R(x)$ and $\psi_I(x)$ as being the components of a 'vector' $\vec{\psi}(x)$ along
the $\vec{e}_R$ and $\vec{e}_I$ axes, respectively, then (N.1) would represent the components
of $\vec{\psi}(x)$ as referred to new axes $\vec{e}_R{}'$ and $\vec{e}_I{}'$, which have been rotated by $-\alpha(x)$
about an axis in the direction $\vec{e}_R \times \vec{e}_I$ (i.e. normal to the $\vec{e}_R$-$\vec{e}_I$ plane), as shown
in figure N.1. Other such 'vectors' $\vec{\phi}_1(x), \vec{\phi}_2(x), \ldots$ (i.e. other wavefunctions
for particles of the same charge $q$) when evaluated at the same point $x$ will
have 'components' transforming the same as (N.1) under the axis rotation
$\vec{e}_R, \vec{e}_I \to \vec{e}_R{}', \vec{e}_I{}'$. But the components of the vector $\vec{\psi}(x + dx)$ will behave
differently. The transformation law (N.1) when written at $x + dx$ will involve
$\alpha(x + dx)$, which (to first order in $dx$) is $\alpha(x) + \partial_\mu\alpha(x)dx^\mu$. Thus for $\psi_R(x + dx)$
and $\psi_I(x + dx)$ the rotation angle is $\alpha(x) + \partial_\mu\alpha(x)dx^\mu$ rather than $\alpha(x)$.
FIGURE N.1
Geometrical analogy for a U(1) gauge transformation.

Now comes the key step in the analogy: we may think of the additional
angle $\partial_\mu\alpha(x)dx^\mu$ as coming about because, in going from $x$ to $x + dx$, the
coordinate basis vectors $\vec{e}_R$ and $\vec{e}_I$ have been rotated through $+\partial_\mu\alpha(x)dx^\mu$
(see figure N.2)! But that would mean that our 'naive' approach to rotations
of the derivative of $\vec{\psi}(x)$ amounts to using one set of axes at $x$, and another
at $x + dx$, which is likely to lead to 'trouble'. Consider now an elementary
example (from Schutz 1988, chapter 5) where just this kind of problem arises,
namely the use of polar coordinate basis vectors $\vec{e}_r$ and $\vec{e}_\theta$, which point in the
$r$ and $\theta$ directions respectively. We have, as usual,

$$x = r\cos\theta, \qquad y = r\sin\theta \qquad \text{(N.2)}$$

and in a (real!) Cartesian basis $d\vec{r}$ is given by

$$d\vec{r} = dx\, \vec{i} + dy\, \vec{j}. \qquad \text{(N.3)}$$

Using (N.2) in (N.3) we find

$$d\vec{r} = (dr\cos\theta - r\sin\theta\, d\theta)\,\vec{i} + (dr\sin\theta + r\cos\theta\, d\theta)\,\vec{j}$$
$$\phantom{d\vec{r}} = dr\, \vec{e}_r + d\theta\, \vec{e}_\theta \qquad \text{(N.4)}$$

where

$$\vec{e}_r = \cos\theta\, \vec{i} + \sin\theta\, \vec{j}, \qquad \vec{e}_\theta = -r\sin\theta\, \vec{i} + r\cos\theta\, \vec{j}. \qquad \text{(N.5)}$$

Plainly, $\vec{e}_r$ and $\vec{e}_\theta$ change direction (and even magnitude, for $\vec{e}_\theta$) as we move
about in the $x$-$y$ plane, as shown in figure N.2. So at each point $(r, \theta)$ we
have different axes $\vec{e}_r$, $\vec{e}_\theta$.
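FIGURE N.2
Changes in the basis vectors $\vec{e}_r$ and $\vec{e}_\theta$ of polar coordinates.

Now suppose that we wish to describe a vector field $\vec{V}$ in terms of $\vec{e}_r$ and
$\vec{e}_\theta$ via

$$\vec{V} = V^r \vec{e}_r + V^\theta \vec{e}_\theta \equiv V^\alpha \vec{e}_\alpha \quad (\text{sum on } \alpha = r, \theta), \qquad \text{(N.6)}$$

and that we are also interested in the derivatives of $\vec{V}$, in this basis. Let us
calculate $\partial\vec{V}/\partial r$, for example, by brute force:

$$\frac{\partial \vec{V}}{\partial r} = \frac{\partial V^r}{\partial r}\vec{e}_r + \frac{\partial V^\theta}{\partial r}\vec{e}_\theta + V^r \frac{\partial \vec{e}_r}{\partial r} + V^\theta \frac{\partial \vec{e}_\theta}{\partial r}, \qquad \text{(N.7)}$$

where we have included the derivatives of $\vec{e}_r$ and $\vec{e}_\theta$ to allow for the fact that
these vectors are not constant. From (N.5) we easily find

$$\frac{\partial \vec{e}_r}{\partial r} = 0, \qquad \frac{\partial \vec{e}_\theta}{\partial r} = -\sin\theta\, \vec{i} + \cos\theta\, \vec{j} = \frac{1}{r}\vec{e}_\theta, \qquad \text{(N.8)}$$

which allows the last two terms in (N.7) to be evaluated. Similarly, we can
calculate $\partial\vec{V}/\partial\theta$. In general, we may write these results as

$$\frac{\partial \vec{V}}{\partial q^\beta} = \frac{\partial V^\alpha}{\partial q^\beta}\vec{e}_\alpha + V^\alpha \frac{\partial \vec{e}_\alpha}{\partial q^\beta}, \qquad \text{(N.9)}$$

where $\beta = 1, 2$ with $q^1 = r$, $q^2 = \theta$, and $\alpha = r, \theta$.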

In the present case, we were able to calculate $\partial\vec{e}_\alpha/\partial q^\beta$ explicitly from
(N.5), as in (N.8). But whatever the nature of the coordinate system, $\partial\vec{e}_\alpha/\partial q^\beta$
is some vector and must be expressible as a linear combination of the basis
vectors via an expression of the form

$$\frac{\partial \vec{e}_\alpha}{\partial q^\beta} = \Gamma^\gamma{}_{\alpha\beta}\, \vec{e}_\gamma, \qquad \text{(N.10)}$$

where the repeated index $\gamma$ is summed over as usual ($\gamma = r, \theta$). Inserting
(N.10) into (N.9) and interchanging the 'dummy' (i.e. summed over) indices
$\alpha$ and $\gamma$ gives finally

$$\frac{\partial \vec{V}}{\partial q^\beta} = \left(\frac{\partial V^\alpha}{\partial q^\beta} + \Gamma^\alpha{}_{\gamma\beta} V^\gamma\right)\vec{e}_\alpha. \qquad \text{(N.11)}$$

This is a very important result: it shows that, whereas the components of $\vec{V}$ in
the basis $\vec{e}_\alpha$ are just $V^\alpha$, the components of the derivative of $\vec{V}$ are not simply
$\partial V^\alpha/\partial q^\beta$, but contain an additional term: the 'components of the derivative
of a vector' are not just the 'derivatives of the components of the vector'.
Let us abbreviate $\partial/\partial q^\beta$ to $\partial_\beta$; then (N.11) tells us that in the $\vec{e}_\alpha$ basis,
as used in (N.11), the components of the $\partial_\beta$ derivative of $\vec{V}$ are

$$\partial_\beta V^\alpha + \Gamma^\alpha{}_{\gamma\beta} V^\gamma \equiv D_\beta V^\alpha. \qquad \text{(N.12)}$$

The expression (N.12) is called the 'covariant derivative' of $V^\alpha$ within the
context of the mathematics of general coordinate systems: it is denoted (as
in (N.12)) by $D_\beta V^\alpha$ or, often, by $V^\alpha{}_{;\beta}$ (in the latter notation, $\partial_\beta V^\alpha$ is $V^\alpha{}_{,\beta}$).
The most important property of $D_\beta V^\alpha$ is its transformation character under
general coordinate transformations. Crucially, it transforms as a tensor $T^\alpha{}_\beta$
(see appendix D of volume 1) with the indicated 'one up, one down' indices;
we shall not prove this here, referring instead to Schutz (1988), for example.
This property is the reason for the name 'covariant derivative', meaning in
this case essentially that it transforms the way its indices would have you
believe it should. By contrast, and despite appearances, $\partial_\beta V^\alpha$ by itself does
not transform as a '$T^\alpha{}_\beta$' tensor, and in a similar way $\Gamma^\alpha{}_{\gamma\beta}$ is not a '$T^\alpha{}_{\gamma\beta}$'-type
tensor; only the combined object $D_\beta V^\alpha$ is a '$T^\alpha{}_\beta$'.
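For the polar basis (N.5) the connection coefficients of (N.10) can be estimated numerically, as a check on (N.8). The sketch below differentiates the basis vectors by central differences and expands the result on $(\vec{e}_r, \vec{e}_\theta)$ using Cramer's rule. The step size `h` and the function names are assumptions chosen for illustration.

```python
from math import cos, sin

def e_r(r, th):
    """Cartesian components of the polar basis vector e_r of (N.5)."""
    return (cos(th), sin(th))

def e_th(r, th):
    """Cartesian components of the polar basis vector e_theta of (N.5)."""
    return (-r * sin(th), r * cos(th))

BASIS = {"r": e_r, "theta": e_th}

def d_basis(alpha, beta, r, th, h=1e-6):
    """Central-difference estimate of d e_alpha / d q^beta."""
    e = BASIS[alpha]
    if beta == "r":
        hi, lo = e(r + h, th), e(r - h, th)
    else:
        hi, lo = e(r, th + h), e(r, th - h)
    return ((hi[0] - lo[0]) / (2 * h), (hi[1] - lo[1]) / (2 * h))

def connection(alpha, beta, r, th):
    """Return (Gamma^r_{alpha beta}, Gamma^theta_{alpha beta}) as in (N.10)."""
    d = d_basis(alpha, beta, r, th)
    a, b = e_r(r, th), e_th(r, th)
    det = a[0] * b[1] - a[1] * b[0]           # equals r
    g_r = (d[0] * b[1] - d[1] * b[0]) / det    # coefficient of e_r
    g_th = (a[0] * d[1] - a[1] * d[0]) / det   # coefficient of e_theta
    return g_r, g_th
```

At $r = 2$, for instance, `connection("theta", "r", 2.0, 0.3)` is close to `(0, 0.5)`, in agreement with $\partial\vec{e}_\theta/\partial r = \vec{e}_\theta/r$, and `connection("theta", "theta", 2.0, 0.3)` is close to `(-2, 0)`, reflecting $\partial\vec{e}_\theta/\partial\theta = -r\,\vec{e}_r$.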

This circumstance is highly reminiscent of the situation we found in the
case of gauge transformations. Consider the simplest case, that of U(1), for
which $D_\mu\psi = \partial_\mu\psi + iqA_\mu\psi$. The quantity $D_\mu\psi$ transforms under a gauge
transformation in the same way as $\psi$ itself, but $\partial_\mu\psi$ does not. There is thus
a close analogy between the 'good' transformation properties of $D_\beta V^\alpha$ and of
$D_\mu\psi$. Further, the structure of $D_\mu\psi$ is very similar to that of $D_\beta V^\alpha$. There
are two pieces, the first of which is the straightforward derivative, while the
second involves a new field ($\Gamma$ or $A$) and is also proportional to the original
field. The '$i$' of course is a big difference, showing that in the gauge symmetry
case the transformations mix the real and imaginary parts of the wavefunction,
rather than actual spatial components of a vector.

Indeed, the analogy is even closer in the non-Abelian (e.g. local SU(2))
case. As we have seen, $\partial^\mu\psi^{(1/2)}$ does not transform as an SU(2) isospinor
because of the extra piece involving $\partial^\mu\boldsymbol{\epsilon}$; nor do the gauge fields $\boldsymbol{W}^\mu$ transform
as pure $T = 1$ states, also because of a $\partial^\mu\boldsymbol{\epsilon}$ term. But the gauge covariant
combination $(\partial^\mu + ig\boldsymbol{\tau} \cdot \boldsymbol{W}^\mu/2)\psi^{(1/2)}$ does transform as an isospinor under local
SU(2) transformations, the two 'extra' $\partial^\mu\boldsymbol{\epsilon}$ pieces cancelling each other out.

There is a useful way of thinking about the two contributions to $D_\beta V^\alpha$
(or $D_\mu\psi$). Let us multiply (N.12) by $dq^\beta$ and sum over $\beta$ so as to obtain

$$DV^\alpha \equiv \partial_\beta V^\alpha dq^\beta + \Gamma^\alpha{}_{\gamma\beta} V^\gamma dq^\beta. \qquad \text{(N.13)}$$

FIGURE N.3
Parallel transport of a vector $\vec{V}$ in a polar coordinate basis.

The first term on the right-hand side of (N.13) is $\frac{\partial V^\alpha}{\partial q^\beta} dq^\beta$, which is just the
conventional differential $dV^\alpha$, representing the change in $V^\alpha$ in moving from
$q^\beta$ to $q^\beta + dq^\beta$: $dV^\alpha = [V^\alpha(q^\beta + dq^\beta) - V^\alpha(q^\beta)]$. Again, despite appearances,
the quantities $dV^\alpha$ do not form the components of a vector, and the reason
is that $V^\alpha(q^\beta + dq^\beta)$ are components with respect to axes at $q^\beta + dq^\beta$, while
$V^\alpha(q^\beta)$ are components with respect to different axes at $q^\beta$. To form a 'good'
differential $DV^\alpha$, transforming as a vector, we must subtract quantities
defined in the same coordinate system. This means that we need some way of
'carrying' $V^\alpha(q^\beta)$ to $q^\beta + dq^\beta$, while keeping it somehow 'the same' as it was
at $q^\beta$.

A reasonable definition of such a 'preserved' vector field is one that is
unchanged in length, and has the same orientation relative to the axes at
$q^\beta + dq^\beta$ as it had relative to the axes at $q^\beta$ (see figure N.3). In other words,
$\vec{V}$ is 'dragged around' with the changing coordinate frame, a process called
parallel transport. Such a definition of 'no change' of course implies that
change has occurred, in general, with respect to the original axes at $q^\beta$. Let
us denote by $\delta V^\alpha$ the difference between the components of $\vec{V}$ after parallel
transport to $q^\beta + dq^\beta$, and the components of $\vec{V}$ at $q^\beta$ (see figure N.3). Then a
reasonable definition of the 'good' differential of $V^\alpha$ would be $V^\alpha(q^\beta + dq^\beta) -
(V^\alpha(q^\beta) + \delta V^\alpha) = dV^\alpha - \delta V^\alpha$. We interpret this as the covariant differential
$DV^\alpha$ of (N.13), and accordingly, make the identification

$$\delta V^\alpha = -\Gamma^\alpha{}_{\gamma\beta} V^\gamma dq^\beta. \qquad \text{(N.14)}$$

On this interpretation, then, the coefficients $\Gamma^\alpha{}_{\gamma\beta}$ connect the components of
a vector at one point with its components at a nearby point, after the vector
has been carried by 'parallel transport' from one point to the other; they are
often called 'connection coefficients', or just 'the connection'.
In an analogous way we can write, in the U(1) gauge case,

$$D\psi \equiv D^\mu\psi\, dx_\mu = \partial^\mu\psi\, dx_\mu + ieA^\mu\psi\, dx_\mu$$
$$\phantom{D\psi} \equiv d\psi - \delta\psi \qquad \text{(N.15)}$$

with

$$\delta\psi = -ieA^\mu\psi\, dx_\mu. \qquad \text{(N.16)}$$

Equation (N.16) has a very similar structure to (N.14), suggesting that the
electromagnetic potential $A^\mu$ might well be referred to as a 'gauge connection',
as indeed it is in some quarters. Equations (N.15) and (N.16) generalize
straightforwardly for $D\psi^{(1/2)}$ and $\delta\psi^{(1/2)}$.

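We can relate (N.16) in a very satisfactory way to our original discussion of
electromagnetism as a gauge theory in chapter 2, and in particular to (2.83).
For transport restricted to the three spatial directions, (N.16) reduces to

$$\delta\psi(\boldsymbol{x}) = ie\boldsymbol{A} \cdot d\boldsymbol{x}\, \psi(\boldsymbol{x}). \qquad \text{(N.17)}$$

However, the solution (2.83) gives

$$\psi(\boldsymbol{x}) = \exp\left(ie\int_{-\infty}^{\boldsymbol{x}} \boldsymbol{A} \cdot d\boldsymbol{\ell}\right)\psi(\boldsymbol{A} = \boldsymbol{0}, \boldsymbol{x}), \qquad \text{(N.18)}$$

replacing $q$ by $e$. So

$$\psi(\boldsymbol{x} + d\boldsymbol{x}) = \exp\left(ie\int_{-\infty}^{\boldsymbol{x}+d\boldsymbol{x}} \boldsymbol{A} \cdot d\boldsymbol{\ell}\right)\psi(\boldsymbol{A} = \boldsymbol{0}, \boldsymbol{x} + d\boldsymbol{x})$$
$$= \exp\left(ie\int_{\boldsymbol{x}}^{\boldsymbol{x}+d\boldsymbol{x}} \boldsymbol{A} \cdot d\boldsymbol{\ell}\right)\exp\left(ie\int_{-\infty}^{\boldsymbol{x}} \boldsymbol{A} \cdot d\boldsymbol{\ell}\right)\psi(\boldsymbol{A} = \boldsymbol{0}, \boldsymbol{x} + d\boldsymbol{x})$$
$$\approx (1 + ie\boldsymbol{A} \cdot d\boldsymbol{x})\exp\left(ie\int_{-\infty}^{\boldsymbol{x}} \boldsymbol{A} \cdot d\boldsymbol{\ell}\right)\left[\psi(\boldsymbol{A} = \boldsymbol{0}, \boldsymbol{x}) + \boldsymbol{\nabla}\psi(\boldsymbol{A} = \boldsymbol{0}, \boldsymbol{x}) \cdot d\boldsymbol{x}\right]$$
$$\approx \psi(\boldsymbol{x}) + ie\boldsymbol{A} \cdot d\boldsymbol{x}\, \psi(\boldsymbol{x}) + \exp\left(ie\int_{-\infty}^{\boldsymbol{x}} \boldsymbol{A} \cdot d\boldsymbol{\ell}\right)\boldsymbol{\nabla}\psi(\boldsymbol{A} = \boldsymbol{0}, \boldsymbol{x}) \cdot d\boldsymbol{x}, \qquad \text{(N.19)}$$

to first order in $d\boldsymbol{x}$. On the right-hand side of (N.19) we see (i) the change $\delta\psi$
of (N.17), due to 'parallel transport' as prescribed by the gauge connection $\boldsymbol{A}$,
and (ii) the change in $\psi$ viewed as a function of $\boldsymbol{x}$, in the absence of $\boldsymbol{A}$. The
solution (N.18) gives, in fact, the 'integrated' form of the small displacement
law (N.19).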

FIGURE N.4
Parallel transport (a) round a curved triangle on the surface of a sphere (b)
round a triangle in a flat plane.

At this point the reader might object, going back to the $\vec{e}_r, \vec{e}_\theta$ example,
that we had made a lot of fuss about nothing: after all, no one forced us
to use the $\vec{e}_r, \vec{e}_\theta$ basis, and if we had simply used the $\vec{i}, \vec{j}$ basis (which is
constant throughout the plane) we would have had no such 'trouble'. This
is a fair point, provided that we somehow knew that we are really doing
physics in a 'flat' space, such as the Euclidean plane. But suppose instead that
our two-dimensional space was the surface of a sphere. Then, an intuitively
plausible definition of parallel transport is shown in figure N.4(a), in which
transport is carried out around a closed path consisting of three great circle
arcs $A \to B$, $B \to C$, $C \to A$, with the rule that at each stage the vector
is drawn 'as parallel as possible' to the previous one. It is clear from the
figure that the vector we end up with at $A$, after this circuit, is no longer
parallel to the vector we started with; in fact, it has rotated by $\pi/2$ in this
example, in which $\frac{1}{8}$th of the surface area of the unit sphere is enclosed by
the triangle $ABC$. By contrast, the parallel transport of a vector round a
flat triangle in the Euclidean plane leads to no such net change in the vector
(figure N.4(b)).

It seems reasonable to suppose that the information about whether the
space we are dealing with is ‘flat’ or ‘curved’ is contained in the connection
$\Gamma^\alpha_{\gamma\beta}$. In a similar way, in the gauge case the analogy we have built up so far
would lead us to expect that there are potentials $A^\mu$ which are somehow ‘flat’
($\boldsymbol{E} = \boldsymbol{B} = 0$) and others which represent ‘curvature’ (non-zero $\boldsymbol{E}$, $\boldsymbol{B}$). This
is what we discuss next.

FIGURE N.5
Closed loop ABCD in $q^1$–$q^2$ space.


N.2 Geometrical curvature and the gauge field strength tensor


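Consider a small closed loop in our (possibly curved) two-dimensional space
(see figure N.5) whose four sides are the coordinate lines $q^1 = a$, $q^1 = a + \delta a$,
$q^2 = b$, $q^2 = b + \delta b$. We want to calculate the net change (if any) in
$V^\alpha$ as we parallel transport $V$ around the loop. The change along A → B is

$$(\delta V^\alpha)_{AB} = -\int_{q^2=b,\,q^1=a}^{q^2=b,\,q^1=a+\delta a} \Gamma^\alpha_{\gamma 1} V^\gamma\, dq^1 \approx -\delta a\, \Gamma^\alpha_{\gamma 1}(a, b)\, V^\gamma(a, b) \qquad \text{(N.20)}$$

to first order in $\delta a$, while that along C → D is

$$\begin{aligned}
(\delta V^\alpha)_{CD} &= -\int_{q^2=b+\delta b,\,q^1=a+\delta a}^{q^2=b+\delta b,\,q^1=a} \Gamma^\alpha_{\gamma 1} V^\gamma\, dq^1 = +\int_{q^2=b+\delta b,\,q^1=a}^{q^2=b+\delta b,\,q^1=a+\delta a} \Gamma^\alpha_{\gamma 1} V^\gamma\, dq^1 \\
&\approx \delta a\, \Gamma^\alpha_{\gamma 1}(a, b + \delta b)\, V^\gamma(a, b + \delta b). \qquad \text{(N.21)}
\end{aligned}$$

Now

$$\Gamma^\alpha_{\gamma 1}(a, b + \delta b) \approx \Gamma^\alpha_{\gamma 1}(a, b) + \delta b\, \frac{\partial \Gamma^\alpha_{\gamma 1}}{\partial q^2} \qquad \text{(N.22)}$$

and, remembering that we are parallel-transporting $V$,

$$V^\gamma(a, b + \delta b) \approx V^\gamma(a, b) - \Gamma^\gamma_{\delta 2} V^\delta\, \delta b. \qquad \text{(N.23)}$$

Combining (N.20) and (N.21) to lowest order, we find

$$(\delta V^\alpha)_{AB} + (\delta V^\alpha)_{CD} \approx \delta a\,\delta b \left[\frac{\partial \Gamma^\alpha_{\gamma 1}}{\partial q^2} V^\gamma - \Gamma^\alpha_{\gamma 1}\Gamma^\gamma_{\delta 2} V^\delta\right] \qquad \text{(N.24)}$$

or, interchanging dummy indices $\gamma$ and $\delta$ in the last term,

$$(\delta V^\alpha)_{AB} + (\delta V^\alpha)_{CD} \approx \delta a\,\delta b \left[\frac{\partial \Gamma^\alpha_{\gamma 1}}{\partial q^2} - \Gamma^\alpha_{\delta 1}\Gamma^\delta_{\gamma 2}\right] V^\gamma. \qquad \text{(N.25)}$$

Similarly,

$$(\delta V^\alpha)_{BC} + (\delta V^\alpha)_{DA} \approx \delta a\,\delta b \left[-\frac{\partial \Gamma^\alpha_{\gamma 2}}{\partial q^1} + \Gamma^\alpha_{\delta 2}\Gamma^\delta_{\gamma 1}\right] V^\gamma, \qquad \text{(N.26)}$$

and so the net change around the whole small loop is

$$(\delta V^\alpha)_{ABCD} \approx \delta a\,\delta b \left[\frac{\partial \Gamma^\alpha_{\gamma 1}}{\partial q^2} - \frac{\partial \Gamma^\alpha_{\gamma 2}}{\partial q^1} + \Gamma^\alpha_{\delta 2}\Gamma^\delta_{\gamma 1} - \Gamma^\alpha_{\delta 1}\Gamma^\delta_{\gamma 2}\right] V^\gamma. \qquad \text{(N.27)}$$

The indices ‘1’ and ‘2’ appear explicitly because the loop was chosen to go
along these directions. In general, (N.27) would take the form

$$(\delta V^\alpha)_{\rm loop} \approx \left[\frac{\partial \Gamma^\alpha_{\gamma\beta}}{\partial q^\sigma} - \frac{\partial \Gamma^\alpha_{\gamma\sigma}}{\partial q^\beta} + \Gamma^\alpha_{\delta\sigma}\Gamma^\delta_{\gamma\beta} - \Gamma^\alpha_{\delta\beta}\Gamma^\delta_{\gamma\sigma}\right] V^\gamma\, dA^{\beta\sigma} \qquad \text{(N.28)}$$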


where $dA^{\beta\sigma}$ is the area element. The quantity in brackets in (N.28) is the
Riemann curvature tensor $R^\alpha_{\ \gamma\beta\sigma}$ (up to a sign, depending on conventions),
which can clearly be calculated once the connection coefficients are known.
A flat space is one for which all components $R^\alpha_{\ \gamma\beta\sigma} = 0$; the reader may
verify that this is the case for our polar basis $\hat{\boldsymbol{e}}_r, \hat{\boldsymbol{e}}_\theta$ in the Euclidean plane. A
non-zero value for any component of $R^\alpha_{\ \gamma\beta\sigma}$ means the space is curved.
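
The loop argument can also be checked numerically. The following Python sketch is an illustration only and is not part of the original derivation. It assumes the unit sphere with coordinates $(q^1, q^2) = (\theta, \phi)$, for which the non-zero connection coefficients are $\Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta$ and $\Gamma^\phi_{\theta\phi} = \Gamma^\phi_{\phi\theta} = \cot\theta$, and it integrates the parallel transport law (N.14) around a finite coordinate rectangle with a standard fourth-order Runge-Kutta stepper. The function names (`rhs`, `transport`, `holonomy_angle`) are hypothetical. On the unit sphere the vector should return rotated by an angle whose magnitude equals the enclosed area, $\Delta\phi\,(\cos\theta_1 - \cos\theta_2)$.

```python
import math

def rhs(theta, v, direction):
    """dV^a/dq^b = -Gamma^a_{cb} V^c on the unit sphere (assumed metric).

    v = (V^theta, V^phi) in the coordinate basis; direction names the
    coordinate ('theta' or 'phi') along which the vector is transported.
    """
    v_th, v_ph = v
    cot = math.cos(theta) / math.sin(theta)
    if direction == "theta":
        # Only Gamma^phi_{phi theta} = cot(theta) contributes.
        return (0.0, -cot * v_ph)
    # Along phi: Gamma^theta_{phi phi} = -sin cos, Gamma^phi_{theta phi} = cot.
    return (math.sin(theta) * math.cos(theta) * v_ph, -cot * v_th)

def transport(v, theta, direction, length, steps=2000):
    """RK4 parallel transport along one coordinate line; returns (v, theta)."""
    h = length / steps
    dth = h if direction == "theta" else 0.0
    for _ in range(steps):
        k1 = rhs(theta, v, direction)
        k2 = rhs(theta + dth / 2, (v[0] + h / 2 * k1[0], v[1] + h / 2 * k1[1]), direction)
        k3 = rhs(theta + dth / 2, (v[0] + h / 2 * k2[0], v[1] + h / 2 * k2[1]), direction)
        k4 = rhs(theta + dth, (v[0] + h * k3[0], v[1] + h * k3[1]), direction)
        v = (v[0] + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
             v[1] + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))
        theta += dth
    return v, theta

def holonomy_angle(theta1, theta2, dphi):
    """Rotation angle of a vector carried once round the coordinate rectangle."""
    v, th = (1.0, 0.0), theta1          # start with a unit vector along theta
    v, th = transport(v, th, "phi", dphi)
    v, th = transport(v, th, "theta", theta2 - theta1)
    v, th = transport(v, th, "phi", -dphi)
    v, th = transport(v, th, "theta", theta1 - theta2)
    # Convert to orthonormal components (V^theta, sin(theta) V^phi).
    return math.atan2(math.sin(th) * v[1], v[0])

if __name__ == "__main__":
    t1, t2, dphi = 0.5, 1.2, 0.8
    print("rotation angle:", holonomy_angle(t1, t2, dphi))
    print("enclosed area :", dphi * (math.cos(t1) - math.cos(t2)))
```

The sign of the returned angle depends on the sense in which the loop is traversed; only its magnitude is compared with the area.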

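We now follow exactly similar steps to calculate the net change in $\delta\psi$ as
given by (N.16), around the small two-dimensional rectangle defined by the
coordinate lines $x_1 = a$, $x_1 = a + \delta a$, $x_2 = b$, $x_2 = b + \delta b$, labelled as in figure
N.5 but with $q^1$ replaced by $x_1$ and $q^2$ by $x_2$. Then

$$(\delta\psi)_{AB} = -ieA^1(a, b)\,\psi(a, b)\,\delta a \qquad \text{(N.29)}$$

and

$$\begin{aligned}
(\delta\psi)_{CD} &= +ieA^1(a, b + \delta b)\,\psi(a, b + \delta b)\,\delta a \\
&\approx ie\left(A^1(a, b) + \frac{\partial A^1}{\partial x_2}\delta b\right)\left[\psi(a, b) - ieA^2(a, b)\,\psi(a, b)\,\delta b\right]\delta a \\
&\approx ieA^1(a, b)\,\psi(a, b)\,\delta a + ie\left[\frac{\partial A^1}{\partial x_2}\psi(a, b) - ieA^1(a, b)A^2(a, b)\,\psi(a, b)\right]\delta a\,\delta b. \qquad \text{(N.30)}
\end{aligned}$$

Combining (N.29) and (N.30) we find

$$(\delta\psi)_{AB} + (\delta\psi)_{CD} \approx \left[ie\frac{\partial A^1}{\partial x_2}\psi + e^2 A^1 A^2 \psi\right]\delta a\,\delta b. \qquad \text{(N.31)}$$

Similarly,

$$(\delta\psi)_{BC} + (\delta\psi)_{DA} \approx \left[-ie\frac{\partial A^2}{\partial x_1}\psi - e^2 A^1 A^2 \psi\right]\delta a\,\delta b, \qquad \text{(N.32)}$$

with the result that the net change around the loop is

$$(\delta\psi)_{ABCD} \approx ie\left(\frac{\partial A^1}{\partial x_2} - \frac{\partial A^2}{\partial x_1}\right)\psi\,\delta a\,\delta b. \qquad \text{(N.33)}$$

For a general loop, (N.33) is replaced by

$$(\delta\psi)_{\rm loop} = ie\left(\frac{\partial A^\mu}{\partial x_\nu} - \frac{\partial A^\nu}{\partial x_\mu}\right)\psi\, dx_\mu\, dx_\nu = -ieF^{\mu\nu}\psi\, dx_\mu\, dx_\nu \qquad \text{(N.34)}$$

where $F^{\mu\nu} = \partial^\mu A^\nu - \partial^\nu A^\mu$ is the familiar field strength tensor of QED.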

The analogy we have been pursuing would therefore suggest that $F^{\mu\nu} = 0$
indicates ‘no physical effect’, while $F^{\mu\nu} \neq 0$ implies the presence of a physical
effect. Indeed, when $A^\mu$ has the ‘pure gauge’ form $A^\mu = \partial^\mu\chi$ the associated
$F^{\mu\nu}$ is zero; this is because such an $A^\mu$ can clearly be reduced to zero by
a gauge transformation (and also, consistently, because $(\partial^\mu\partial^\nu - \partial^\nu\partial^\mu)\chi = 0$).
If $A^\mu$ is not expressible as the 4-gradient of a scalar, then $F^{\mu\nu} \neq 0$
and an electromagnetic field is present, analogous to the spatial curvature
revealed by $R^\alpha_{\ \gamma\beta\sigma} \neq 0$. Once again, there is a satisfying consistency between
this ‘geometrical’ viewpoint and the discussion of the Aharonov-Bohm effect
in Section 2.6. As in our remarks at the end of the previous section, and
equations (N.17)-(N.19), equation (2.83) can be regarded as the integrated
form of (N.34), for spatial loops. Transport round such a loop results in a
non-trivial net phase change if non-zero $\boldsymbol{B}$ flux is enclosed, and this can be
observed.
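
A short numerical illustration of the last point may be useful. The Python sketch below is not from the original text. It assumes a two-dimensional spatial loop and two hypothetical potentials: one describing a uniform field $B$ perpendicular to the plane, $\boldsymbol{A} = (-By/2, Bx/2)$, and one of pure gauge form $\boldsymbol{A} = \boldsymbol{\nabla}\chi$ with $\chi = x^2 y + \sin x$. It estimates the loop integral $\oint \boldsymbol{A}\cdot d\boldsymbol{\ell}$ by the midpoint rule; by Stokes' theorem this equals the enclosed flux, and the net phase factor acquired in transport round the loop is $\exp(ie\oint \boldsymbol{A}\cdot d\boldsymbol{\ell})$, as suggested by (N.18).

```python
import cmath
import math

def loop_integral(A, x0, y0, dx, dy, n=2000):
    """Midpoint-rule estimate of the anticlockwise integral of A.dl round
    the rectangle with corners (x0, y0) and (x0 + dx, y0 + dy)."""
    total = 0.0
    for k in range(n):
        t = (k + 0.5) / n
        total += A(x0 + t * dx, y0)[0] * dx / n        # bottom side, +x direction
        total += A(x0 + dx, y0 + t * dy)[1] * dy / n   # right side, +y direction
        total -= A(x0 + t * dx, y0 + dy)[0] * dx / n   # top side, -x direction
        total -= A(x0, y0 + t * dy)[1] * dy / n        # left side, -y direction
    return total

B_FIELD = 0.7  # assumed uniform field strength (arbitrary units)

def uniform_field(x, y):
    """Potential with curl equal to B_FIELD everywhere."""
    return (-0.5 * B_FIELD * y, 0.5 * B_FIELD * x)

def pure_gauge(x, y):
    """Gradient of chi = x^2 y + sin x, so the curl vanishes."""
    return (2 * x * y + math.cos(x), x * x)

def loop_phase(A, x0, y0, dx, dy, e=1.0):
    """Net phase factor exp(i e * loop integral) for transport round the loop."""
    return cmath.exp(1j * e * loop_integral(A, x0, y0, dx, dy))

if __name__ == "__main__":
    print("flux, uniform field:", loop_integral(uniform_field, 0.1, 0.2, 0.5, 0.3))
    print("expected B * area  :", B_FIELD * 0.5 * 0.3)
    print("pure gauge loop    :", loop_integral(pure_gauge, 0.1, 0.2, 0.5, 0.3))
```

The pure gauge potential gives a vanishing loop integral (up to discretization error) and hence a trivial phase, while the uniform field gives a phase proportional to the enclosed flux.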

From this point of view there is undoubtedly a strong conceptual link 
between Einstein's theory of gravity and quantum gauge theories. In the 
former, matter (or energy) is regarded as the source of curvature of space- 
time, causing the space-time axes themselves to vary from point to point, 
and determining the trajectories of massive particles; in the latter, charge is 
the source of curvature in an ‘internal’ space (the complex $\psi$-plane, in the
U(1) case), a curvature which we call an electromagnetic field, and which has 
observable physical effects. 

The reader may consider repeating, for the local SU(2) case, the closed- 
loop transport calculation of (N.29)-(N.33). For this calculation, the place 
of the Abelian vector potential is taken by the matrix-valued non-Abelian 
potential $A^\mu = \boldsymbol{\tau}/2\cdot\boldsymbol{A}^\mu$. It will lead to the expression for the non-Abelian
field strength tensor as calculated in section 13.1.2. 


O 


Dimensional Regularization 


After combining propagator denominators of the form $(p^2 - m^2 + i\epsilon)^{-1}$ by
Feynman parameters (cf (10.40)), and shifting the origin of the loop momentum
to complete the square (cf (10.42) and (11.16)), all one-loop Feynman
integrals may be reduced to evaluating an integral of the form

$$I_d(\Delta, n) \equiv \int \frac{d^dk}{(2\pi)^d}\, \frac{1}{[k^2 - \Delta + i\epsilon]^n} \qquad \text{(O.1)}$$

or to a similar integral with factors of $k$ (such as $k_\mu k_\nu$) in the numerator. We
consider (O.1) first.

For our purposes, the case of physical interest is d = 4, and n is commonly
2 (e.g. in one-loop self-energies). Power-counting shows that (O.1) diverges
as $k \to \infty$ for $d \geq 2n$. The idea behind dimensional regularization (’t Hooft
and Veltman 1972) is to treat d as a variable parameter, taking values smaller
than 2n, so that (O.1) converges and can be evaluated explicitly as a function
of d (and of course the other variables, including n).¹ Then the nature of the
divergence as $d \to 4$ can be exposed (much as we did with the cut-off procedure
in section 10.3), and dealt with by a suitable renormalization scheme.
The crucial advantage of dimensional regularization is that it preserves gauge
invariance, unlike the simple cut-off regularization we used in chapters 10 and
11.

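¹ We concentrate here on ultraviolet divergences, but infrared ones (such as those met in
section 14.4.2) can be dealt with too, by choosing d larger than 2n.

We write

$$I_d = \frac{1}{(n-1)!}\left(\frac{\partial}{\partial\Delta}\right)^{n-1} \int \frac{d^dk}{(2\pi)^d}\, \frac{1}{k^2 - \Delta + i\epsilon}. \qquad \text{(O.2)}$$

The d dimensions are understood as one time-like dimension $k^0$, and d − 1
spacelike dimensions. We begin by ‘Euclideanizing’ the integral, by setting
$k^0 = ik^e$ with $k^e$ real. Then the Minkowskian square $k^2$ becomes $-(k^e)^2 - \boldsymbol{k}^2 \equiv -k_E^2$,
and $d^dk$ becomes $i\,d^dk_E$, so that now

$$I_d = \frac{-i}{(n-1)!}\left(\frac{\partial}{\partial\Delta}\right)^{n-1} \int \frac{d^dk_E}{(2\pi)^d}\, \frac{1}{k_E^2 + \Delta}; \qquad \text{(O.3)}$$

the ‘$i\epsilon$’ may be understood as included in $\Delta$. The integral is evaluated by
introducing the following way of writing $(k_E^2 + \Delta)^{-1}$:

$$(k_E^2 + \Delta)^{-1} = \int_0^\infty d\beta\, e^{-\beta(k_E^2 + \Delta)}, \qquad \text{(O.4)}$$

which leads to

$$I_d = \frac{-i}{(n-1)!}\left(\frac{\partial}{\partial\Delta}\right)^{n-1} \int_0^\infty d\beta \int \frac{d^dk_E}{(2\pi)^d}\, e^{-\beta(k_E^2 + \Delta)}. \qquad \text{(O.5)}$$

The interchange of the orders of the $\beta$ and $k_E$ integrations is permissible since
$I_d$ is convergent. The $k_E$ integrals are, in fact, a series of Gaussians:

$$\int \frac{d^dk_E}{(2\pi)^d}\, e^{-\beta(k_E^2 + \Delta)} = e^{-\beta\Delta} \prod_{j=1}^{d} \int \frac{dk_j}{2\pi}\, e^{-\beta k_j^2} = \frac{e^{-\beta\Delta}}{(4\pi\beta)^{d/2}}. \qquad \text{(O.6)}$$

Hence

$$\begin{aligned}
I_d &= \frac{-i}{(n-1)!}\left(\frac{\partial}{\partial\Delta}\right)^{n-1} \int_0^\infty d\beta\, \frac{e^{-\beta\Delta}}{(4\pi\beta)^{d/2}} \\
&= \frac{-i\,(-1)^{n-1}}{(n-1)!\,(4\pi)^{d/2}} \int_0^\infty d\beta\, \beta^{\,n-(d/2)-1}\, e^{-\beta\Delta}. \qquad \text{(O.7)}
\end{aligned}$$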

The last integral can be written in terms of Euler’s integral for the gamma
function $\Gamma(z)$ defined by (see, for example, Boas 1983, chapter 11)

$$\Gamma(z) = \int_0^\infty x^{z-1} e^{-x}\, dx. \qquad \text{(O.8)}$$

Since $\Gamma(n) = (n-1)!$, it is convenient to write (O.7) entirely in terms of $\Gamma$
functions as

$$I_d = i\,\frac{(-1)^n}{(4\pi)^{d/2}}\, \frac{\Gamma(n - d/2)}{\Gamma(n)}\, \Delta^{(d/2)-n}. \qquad \text{(O.9)}$$

Equation (O.9) gives an explicit definition of $I_d$ which can be used for any
value of d, not necessarily an integer. As a function of z, $\Gamma(z)$ has isolated
poles (see appendix F of volume 1) at z = 0, −1, −2, .... The behaviour near
z = 0 is given by

$$\Gamma(z) = \frac{1}{z} - \gamma + O(z), \qquad \text{(O.10)}$$

where $\gamma$ is the Euler-Mascheroni constant having the value $\gamma \approx 0.5772$. Using

$$z\Gamma(z) = \Gamma(z+1), \qquad \text{(O.11)}$$

we find the behaviour near z = −1:

$$\Gamma(-1+z) = \frac{-1}{1-z}\,\Gamma(z) = -\left[\frac{1}{z} + 1 - \gamma + O(z)\right]; \qquad \text{(O.12)}$$

similarly

$$\Gamma(-2+z) = \frac{1}{2}\left[\frac{1}{z} + \frac{3}{2} - \gamma + O(z)\right]. \qquad \text{(O.13)}$$

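Consider now the case n = 2, for which $\Gamma(n - d/2)$ in (O.9) will have a
pole at d = 4. Setting $d = 4 - \epsilon$, the divergent behaviour is given by

$$\Gamma(2 - d/2) = \frac{2}{\epsilon} - \gamma + O(\epsilon) \qquad \text{(O.14)}$$

from (O.10). $I_d(\Delta, 2)$ is then given by

$$I_d(\Delta, 2) = \frac{i}{(4\pi)^{2-\epsilon/2}}\, \Delta^{-\epsilon/2} \left[\frac{2}{\epsilon} - \gamma + O(\epsilon)\right]. \qquad \text{(O.15)}$$

When $\Delta^{-\epsilon/2}$ and $(4\pi)^{-2+\epsilon/2}$ are expanded in powers of $\epsilon$, for small $\epsilon$, the
terms linear in $\epsilon$ will produce terms independent of $\epsilon$ when multiplied by the
$\epsilon^{-1}$ in the bracket of (O.15). Using $x^\epsilon \approx 1 + \epsilon \ln x + O(\epsilon^2)$ we find

$$I_d(\Delta, 2) = \frac{i}{(4\pi)^2}\left[\frac{2}{\epsilon} - \gamma + \ln 4\pi - \ln\Delta + O(\epsilon)\right]. \qquad \text{(O.16)}$$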


Another source of $\epsilon$-dependence arises from the fact (see problem 15.7) that
a gauge coupling which is dimensionless in d = 4 dimensions will acquire mass
dimension $\mu^{\epsilon/2}$ in $d = 4 - \epsilon$ dimensions (check this!). A vacuum polarization
loop with two powers of the coupling will then contain a factor $\mu^\epsilon$. When
expanded in powers of $\epsilon$, this will convert the $\ln\Delta$ in (O.16) to $\ln(\Delta/\mu^2)$.
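
The closed form (O.9) and the small-$\epsilon$ expansion (O.16) can be compared numerically. The following Python sketch is an illustration under stated assumptions, not part of the original text: it evaluates the Euclidean quantity $\Gamma(n - d/2)\,\Delta^{(d/2)-n} / [(4\pi)^{d/2}\Gamma(n)]$, which is (O.9) with the overall factor $i(-1)^n$ stripped off, and checks it against the bracket of (O.16) for small $\epsilon$. It also checks the convergent case d = 3, n = 2, where the Euclidean integral is elementary and equals $1/(8\pi\sqrt{\Delta})$. The function names are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def euclidean_integral(d, n, delta):
    """(O.9) without the factor i(-1)^n: the integral of
    d^d k_E / (2 pi)^d over (k_E^2 + delta)^(-n), continued to any d < 2n."""
    return (math.gamma(n - d / 2) * delta ** (d / 2 - n)
            / ((4 * math.pi) ** (d / 2) * math.gamma(n)))

def pole_expansion(eps, delta):
    """Right-hand side of (O.16) divided by i, dropping the O(eps) terms."""
    bracket = 2 / eps - EULER_GAMMA + math.log(4 * math.pi) - math.log(delta)
    return bracket / (4 * math.pi) ** 2

if __name__ == "__main__":
    delta = 2.0
    for eps in (1e-1, 1e-2, 1e-3, 1e-4):
        exact = euclidean_integral(4 - eps, 2, delta)
        approx = pole_expansion(eps, delta)
        # The difference should shrink linearly with eps.
        print(eps, exact, approx, exact - approx)
```

The printed differences decrease in proportion to $\epsilon$, as expected for an expansion that is accurate up to $O(\epsilon)$ corrections.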

Renormalization schemes will subtract the explicit pole pieces (which diverge
as $\epsilon \to 0$), but may also include in the subtraction certain finite terms as
well. For example, in the ‘minimal subtraction’ (MS) scheme, one subtracts
just the pole pieces; in the ‘modified minimal subtraction’ or $\overline{\rm MS}$ (‘emm-ess-bar’)
scheme (Bardeen et al. 1978) one subtracts the pole and the ‘$-\gamma + \ln 4\pi$’
piece.

The change from one scheme ‘A’ to another ‘B’ must involve a finite
renormalization of the form (Ellis et al. 1996, section 2.5)

$$\alpha_s^B = \alpha_s^A\left(1 + A_1\alpha_s^A + \ldots\right). \qquad \text{(O.17)}$$

Note that this implies that the first two coefficients of the $\beta$ function are
unchanged under this transformation, so that they are scheme-independent.
Subsequent coefficients are scheme-dependent, as is the QCD parameter $\Lambda$
introduced in section 15.3. From (15.54) the two corresponding values of $\Lambda$
are related by

$$\begin{aligned}
\ln\left(\frac{\Lambda_B}{\Lambda_A}\right) &= \frac{1}{2}\int_{\alpha_s^A(|q^2|)}^{\alpha_s^B(|q^2|)} \frac{dx}{\beta_0 x^2(1 + \ldots)} \qquad \text{(O.18)} \\
&= \frac{A_1}{2\beta_0} \qquad \text{(O.19)}
\end{aligned}$$

where we have taken $|q^2| \to \infty$ in (O.18) since the left-hand side is independent
of $|q^2|$. Hence the relationship between the $\Lambda$’s in different schemes is
determined by the one-loop calculation which gives $A_1$ in (O.19). For example,
changing from MS to $\overline{\rm MS}$ gives (problem 15.8)

$$\Lambda^2_{\overline{\rm MS}} = \Lambda^2_{\rm MS}\exp(\ln 4\pi - \gamma), \qquad \text{(O.20)}$$

as the reader may check.
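Finally, consider the integral

$$I_d^{\mu\nu}(\Delta, n) \equiv \int \frac{d^dk}{(2\pi)^d}\, \frac{k^\mu k^\nu}{[k^2 - \Delta + i\epsilon]^n}. \qquad \text{(O.21)}$$

From Lorentz covariance this must be proportional to the only second-rank
tensor available, namely $g^{\mu\nu}$:

$$I_d^{\mu\nu} = A g^{\mu\nu}. \qquad \text{(O.22)}$$

The constant ‘A’ can be determined by contracting both sides of (O.21) with
$g_{\mu\nu}$, using $g^{\mu\nu}g_{\mu\nu} = d$ in d dimensions. So

$$\begin{aligned}
A &= \frac{1}{d}\int \frac{d^dk}{(2\pi)^d}\, \frac{k^2}{(k^2 - \Delta + i\epsilon)^n} \\
&= \frac{1}{d}\left[I_d(\Delta, n-1) + \Delta\, I_d(\Delta, n)\right] \\
&= \frac{i(-1)^n \Delta^{(d/2)-n+1}}{(4\pi)^{d/2}\, d}\left\{-\frac{\Gamma(n-1-d/2)}{\Gamma(n-1)} + \frac{\Gamma(n-d/2)}{\Gamma(n)}\right\} \\
&= \frac{i(-1)^n \Delta^{(d/2)-n+1}}{(4\pi)^{d/2}\, d}\, \frac{\Gamma(n-1-d/2)}{\Gamma(n)}\left(-\frac{d}{2}\right) \\
&= \frac{i(-1)^{n-1} \Delta^{(d/2)-n+1}}{(4\pi)^{d/2}}\, \frac{1}{2}\, \frac{\Gamma(n-1-d/2)}{\Gamma(n)}. \qquad \text{(O.23)}
\end{aligned}$$

Using these results, one can show straightforwardly that the gauge-non-invariant
part of (11.18), i.e. the piece in braces, vanishes. With the technique
of dimensional regularization, starting from a gauge-invariant formulation of
the theory the renormalization programme can be carried out while retaining
manifest gauge invariance.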


P


Grassmann Variables 


In the path integral representation of quantum amplitudes (chapter 16) the 
fields are regarded as classical functions. Matrix elements of time-ordered 
products of bosonic operators could be satisfactorily represented (see the dis- 
cussion following (16.79)). But something new is needed to represent, for 
example, the time-ordered product of two fermionic operators: there must 
be a sign difference between the two orderings, since the fermionic operators 
anticommute. Thus it seems that to represent amplitudes involving fermionic 
operators by path integrals we must think in terms of ‘classical’ anticommut- 
ing variables. 

Fortunately, the necessary mathematics was developed by Grassmann in
1855, and applied to quantum amplitudes by Berezin (1966). Any two Grassmann
numbers $\theta_1, \theta_2$ satisfy the fundamental relation

$$\theta_1\theta_2 + \theta_2\theta_1 = 0, \qquad \text{(P.1)}$$

and of course

$$\theta_1^2 = \theta_2^2 = 0. \qquad \text{(P.2)}$$

Grassmann numbers can be added and subtracted in the ordinary way, and
multiplied by ordinary numbers. For our application, the essential thing we
need to be able to do with Grassmann numbers is to integrate over them.
It is natural to think that, as with ordinary numbers and functions, integration
would be some kind of inverse of differentiation. So let us begin with
differentiation.


We define

$$\frac{\partial(a\theta)}{\partial\theta} = a, \qquad \text{(P.3)}$$

where a is any ordinary number, and

$$\frac{\partial}{\partial\theta_1}(\theta_1\theta_2) = \theta_2; \qquad \text{(P.4)}$$

then necessarily

$$\frac{\partial}{\partial\theta_2}(\theta_1\theta_2) = -\theta_1. \qquad \text{(P.5)}$$

Consider now a function of one such variable, $f(\theta)$. An expansion of f in
powers of $\theta$ terminates after only two terms because of the property (P.2):

$$f(\theta) = a + b\theta. \qquad \text{(P.6)}$$

So

$$\frac{\partial f(\theta)}{\partial\theta} = b, \qquad \text{(P.7)}$$

but also

$$\frac{\partial^2 f}{\partial\theta^2} = 0 \qquad \text{(P.8)}$$

for any such f. Hence the operator $\partial/\partial\theta$ has no inverse (think of the matrix
analogue $A^2 = 0$: if $A^{-1}$ existed, we could deduce $0 = A^{-1}(A^2) = (A^{-1}A)A = A$
for all A). Thus we must approach Grassmann integration other than via
an inverse of differentiation.

We only need to consider integrals over the complete range of $\theta$, of the
form

$$\int d\theta\, f(\theta) = \int d\theta\,(a + b\theta). \qquad \text{(P.9)}$$

Such an integral should be linear in f; thus it must be a linear function of
a and b. One further property fixes its value: we require the result to be
invariant under translations of $\theta$ by $\theta \to \theta + \eta$, where $\eta$ is a Grassmann
number. This property is crucial to manipulations made in the path integral
formalism, for instance in ‘completing the square’ manipulations similar to
those in section 16.3, but with Grassmann numbers. So we require

$$\int d\theta\,(a + b\theta) = \int d\theta\,([a + b\eta] + b\theta). \qquad \text{(P.10)}$$

This has changed the constant (independent of $\theta$) term, but left the linear
term unchanged. The only linear function of a and b which behaves like this
is a multiple of b, which is conventionally taken to be simply b. Thus we define

$$\int d\theta\,(a + b\theta) = b, \qquad \text{(P.11)}$$

which means that integration is in some sense the same as differentiation!

When we integrate over products of different $\theta$’s, we need to specify a
convention about the order in which the integrals are to be performed. We
adopt the convention

$$\int d\theta_1 \int d\theta_2\; \theta_2\theta_1 = 1; \qquad \text{(P.12)}$$

that is, the innermost integral is done first, then the next, and so on.

Since our application will be to Dirac fields, which are complex-valued,
we need to introduce complex Grassmann numbers, which are built out of
real and imaginary parts in the usual way (this would not be necessary for
Majorana fermions). Thus we may define

$$\psi = \frac{1}{\sqrt{2}}(\theta_1 + i\theta_2), \qquad \psi^* = \frac{1}{\sqrt{2}}(\theta_1 - i\theta_2), \qquad \text{(P.13)}$$

and then

$$-i\,d\psi\, d\psi^* = d\theta_1\, d\theta_2. \qquad \text{(P.14)}$$

It is convenient to define complex conjugation to include reversing the order
of quantities:

$$(\psi\chi)^* = \chi^*\psi^*. \qquad \text{(P.15)}$$

Then (P.14) is consistent under complex conjugation.

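We are now ready to evaluate some Gaussian integrals over Grassmann
variables, which is essentially all we need in the path integral formalism. We
begin with

$$\begin{aligned}
\int\!\!\int d\psi^*\, d\psi\; e^{-b\psi^*\psi} &= \int\!\!\int d\psi^*\, d\psi\,(1 - b\psi^*\psi) \\
&= \int\!\!\int d\psi^*\, d\psi\,(1 + b\psi\psi^*) = b. \qquad \text{(P.16)}
\end{aligned}$$

Note that the analogous integral with ordinary variables is

$$\int\!\!\int dx\, dy\; e^{-b(x^2+y^2)/2} = 2\pi/b. \qquad \text{(P.17)}$$

The important point here is that, in the Grassmann case, b appears with a
positive, rather than a negative, power. On the other hand, if we insert a
factor $\psi\psi^*$ into the integrand in (P.16), we find that it becomes

$$\int\!\!\int d\psi^*\, d\psi\; \psi\psi^*\, e^{-b\psi^*\psi} = \int\!\!\int d\psi^*\, d\psi\; \psi\psi^* = 1, \qquad \text{(P.18)}$$

and the insertion has effectively produced a factor $b^{-1}$. This effect of an
insertion is the same in the ‘ordinary variables’ case:

$$\int\!\!\int dx\, dy\; \frac{(x^2+y^2)}{2}\, e^{-b(x^2+y^2)/2} = 2\pi/b^2. \qquad \text{(P.19)}$$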

Now consider a Gaussian integral involving two different Grassmann variables:

$$\int\!\!\int d\psi_1^*\, d\psi_1\, d\psi_2^*\, d\psi_2\; e^{-\psi^{*T} M \psi}, \qquad \text{(P.20)}$$

where

$$\psi = \begin{pmatrix} \psi_1 \\ \psi_2 \end{pmatrix}, \qquad \text{(P.21)}$$

and M is a 2 × 2 matrix, whose entries are ordinary numbers. The only
terms which survive the integration are those which, in the expansion of the
exponential, contain each of $\psi_1^*$, $\psi_1$, $\psi_2^*$ and $\psi_2$ exactly once. These are the
terms

$$\frac{1}{2}\left[M_{11}M_{22}(\psi_1^*\psi_1\psi_2^*\psi_2 + \psi_2^*\psi_2\psi_1^*\psi_1) + M_{12}M_{21}(\psi_1^*\psi_2\psi_2^*\psi_1 + \psi_2^*\psi_1\psi_1^*\psi_2)\right]. \qquad \text{(P.22)}$$

To integrate (P.22) conveniently, according to the convention (P.12), we need
to re-order the terms into the form $\psi_2\psi_2^*\psi_1\psi_1^*$; this produces

$$(M_{11}M_{22} - M_{12}M_{21})(\psi_2\psi_2^*\psi_1\psi_1^*), \qquad \text{(P.23)}$$

and the integral (P.20) is therefore just

$$\int\!\!\int d\psi_1^*\, d\psi_1\, d\psi_2^*\, d\psi_2\; e^{-\psi^{*T} M \psi} = \det M. \qquad \text{(P.24)}$$

The reader may show, or take on trust, the obvious generalization to N
independent complex Grassmann variables $\psi_1, \psi_2, \psi_3, \ldots, \psi_N$. This result
is sufficient to establish the assertion made in section 16.4 concerning the
integral (16.90), when written in ‘discretized’ form.
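
The Grassmann rules above are simple enough to implement directly. The following Python sketch is an illustration, not part of the original text. It assumes a representation of Grassmann-algebra elements as dictionaries mapping sorted tuples of generator labels to coefficients, with generator $2i$ standing for $\psi^*_{i+1}$ and generator $2i+1$ for $\psi_{i+1}$; integration over a generator is implemented as removal of that generator after anticommuting it to the front, which reproduces the definition (P.11) and the ordering convention that the innermost integral is done first. All function names are hypothetical, and the matrix entries are assumed to be integers or fractions so that the arithmetic is exact. Under these assumptions the code reproduces (P.16) and (P.24).

```python
from fractions import Fraction

def gmul(f, g):
    """Product of two elements; keys are sorted tuples of generator labels."""
    out = {}
    for a, ca in f.items():
        for b, cb in g.items():
            if set(a) & set(b):
                continue  # a repeated generator gives zero, as in (P.2)
            seq, sign = list(a + b), 1
            # Bubble sort, flipping the sign once per transposition (P.1).
            for i in range(len(seq)):
                for j in range(len(seq) - 1 - i):
                    if seq[j] > seq[j + 1]:
                        seq[j], seq[j + 1] = seq[j + 1], seq[j]
                        sign = -sign
            key = tuple(seq)
            out[key] = out.get(key, 0) + sign * ca * cb
    return out

def gadd(f, g):
    """Sum of two elements."""
    out = dict(f)
    for key, c in g.items():
        out[key] = out.get(key, 0) + c
    return out

def gexp(x, ngen):
    """exp(x) for an even, nilpotent element x; the series terminates."""
    term = {(): Fraction(1)}
    total = dict(term)
    for k in range(1, ngen + 1):
        term = {key: c / k for key, c in gmul(term, x).items()}
        total = gadd(total, term)
    return total

def integrate(f, gen):
    """Berezin integral over one generator: move it to the front, then drop it."""
    out = {}
    for key, c in f.items():
        if gen in key:
            p = key.index(gen)
            rest = key[:p] + key[p + 1:]
            out[rest] = out.get(rest, 0) + (-1) ** p * c
    return out

def grassmann_gaussian(M):
    """Integral of exp(-psi^{*T} M psi) over all psi_i^*, psi_i."""
    n = len(M)
    x = {}
    for i in range(n):
        for j in range(n):
            star_i = {(2 * i,): Fraction(-M[i][j])}   # -M_ij psi_i^*
            plain_j = {(2 * j + 1,): Fraction(1)}     # psi_j
            x = gadd(x, gmul(star_i, plain_j))
    f = gexp(x, 2 * n)
    for gen in reversed(range(2 * n)):  # innermost integral first
        f = integrate(f, gen)
    return f.get((), Fraction(0))

if __name__ == "__main__":
    print(grassmann_gaussian([[4]]))              # compare with (P.16)
    print(grassmann_gaussian([[2, 3], [5, 7]]))   # compare with det M in (P.24)
```

The same routine can be called with a larger matrix to explore the generalization to N variables mentioned above.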

We may contrast (P.24) with an analogous result for two ordinary complex
numbers $z_1, z_2$. In this case we consider the integral

$$\int\!\!\int dz_1\, dz_1^*\, dz_2\, dz_2^*\; e^{-z^\dagger H z}, \qquad \text{(P.25)}$$

where z is a two-component column matrix with elements $z_1$ and $z_2$. We take
the matrix H to be Hermitian, with positive eigenvalues $b_1$ and $b_2$. Let H be
diagonalized by the unitary transformation

$$\begin{pmatrix} z_1' \\ z_2' \end{pmatrix} = U \begin{pmatrix} z_1 \\ z_2 \end{pmatrix} \qquad \text{(P.26)}$$

with $UU^\dagger = I$. Then

$$dz_1'\, dz_2' = \det U\; dz_1\, dz_2, \qquad \text{(P.27)}$$

and so

$$dz_1'\, dz_1'^*\, dz_2'\, dz_2'^* = dz_1\, dz_1^*\, dz_2\, dz_2^*, \qquad \text{(P.28)}$$

since $|\det U|^2 = 1$. The integral (P.25) then becomes

$$\int\!\!\int dz_1'\, dz_1'^*\; e^{-b_1 z_1'^* z_1'} \int\!\!\int dz_2'\, dz_2'^*\; e^{-b_2 z_2'^* z_2'}, \qquad \text{(P.29)}$$

the integrals converging provided $b_1, b_2 > 0$. Next, setting $z_1 = (x_1 + iy_1)/\sqrt{2}$,
$z_2 = (x_2 + iy_2)/\sqrt{2}$, (P.29) can be evaluated using (P.17), and the
result is proportional to $(b_1 b_2)^{-1}$, which is the inverse of the determinant
of the matrix H, when diagonalized. Thus (compare (P.16) and (P.17))
Gaussian integrals over complex Grassmann variables are proportional to the
determinant of the matrix in the exponent, while those over ordinary complex
variables are proportional to the inverse of the determinant.

Returning to integrals of the form (P.20), consider now a two-variable
(both complex) analogue of (P.18):

$$\int\!\!\int d\psi_1^*\, d\psi_1\, d\psi_2^*\, d\psi_2\; \psi_1\psi_2^*\, e^{-\psi^{*T} M \psi}. \qquad \text{(P.30)}$$

This time, only the term $\psi_1^*\psi_2$ in the expansion of the exponential will survive
the integration, and the result is just $-M_{12}$. By exploring a similar integral
(still with the term $\psi_1\psi_2^*$) in the case of three complex Grassmann variables,
the reader should be convinced that the general result is

$$\prod_i \int\!\!\int d\psi_i^*\, d\psi_i\; \psi_k\psi_l^*\, e^{-\psi^{*T} M \psi} = (M^{-1})_{kl} \det M. \qquad \text{(P.31)}$$

With this result we can make plausible the fermionic analogue of (16.87),
namely

$$\langle\Omega|T\{\psi(x_1)\bar\psi(x_2)\}|\Omega\rangle = \frac{\int \mathcal{D}\bar\psi\,\mathcal{D}\psi\; \psi(x_1)\bar\psi(x_2)\exp\left[-\int d^4x_E\, \bar\psi(i\not\partial - m)\psi\right]}{\int \mathcal{D}\bar\psi\,\mathcal{D}\psi\; \exp\left[-\int d^4x_E\, \bar\psi(i\not\partial - m)\psi\right]}; \qquad \text{(P.32)}$$

note that $\bar\psi$ and $\psi^*$ are unitarily equivalent. The denominator of this expression
is¹ $\det(i\not\partial - m)$, while the numerator is this same determinant multiplied
by the inverse of the operator $(i\not\partial - m)$; but this is just $(\not p - m)^{-1}$ in momentum
space, the familiar Dirac propagator.


¹ The reader may interpret this as a finite-dimensional determinant, after discretization.


Feynman Rules for Tree Graphs in QCD and the Electroweak Theory


Q.1 QCD
Q.1.1 External particles
Quarks

The SU(3) colour degree of freedom is not written explicitly; the spinors have
3 (colour) × 4 (Dirac) components. For each fermion or antifermion line
entering the graph include the spinor

$$u(p, s) \quad \text{or} \quad v(p, s) \qquad \text{(Q.1)}$$

and for spin-1/2 particles leaving the graph the spinor

$$\bar{u}(p', s') \quad \text{or} \quad \bar{v}(p', s'), \qquad \text{(Q.2)}$$

as for QED.

Gluons

Besides the spin-1 polarization vector, external gluons also have a ‘colour
polarization’ vector $a^c$ (c = 1, 2, ..., 8) specifying the particular colour state
involved. For each gluon line entering the graph include the factor

$$\epsilon_\mu(k, \lambda)\, a^c \qquad \text{(Q.3)}$$

and for gluons leaving the graph the factor

$$\epsilon_\mu^*(k', \lambda')\, a^{c*}. \qquad \text{(Q.4)}$$

Q.1.2 Propagators
Quark

$$\frac{i}{\not p - m} = i\,\frac{\not p + m}{p^2 - m^2} \qquad \text{(Q.5)}$$

Gluon

$$\frac{i}{k^2}\left[-g^{\mu\nu} + (1 - \xi)\frac{k^\mu k^\nu}{k^2}\right]\delta^{ab} \qquad \text{(Q.6)}$$

for a general $\xi$ gauge. Calculations are usually performed in Lorentz or Feynman
gauge with $\xi = 1$ and gluon propagator equal to

$$\frac{i(-g^{\mu\nu})\,\delta^{ab}}{k^2}. \qquad \text{(Q.7)}$$

Here a and b run over the 8 colour indices 1, 2, ..., 8.
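Q.1.3 Vertices

Quark-gluon vertex (gluon with Lorentz index $\mu$ and colour index a):

$$-ig_s\frac{\lambda_a}{2}\gamma_\mu$$

Three-gluon vertex (gluon lines labelled $(\mu, k_1, a)$, $(\nu, k_2, b)$, $(\lambda, k_3, c)$):

$$-g_s f_{abc}\left[g_{\mu\nu}(k_1 - k_2)_\lambda + g_{\nu\lambda}(k_2 - k_3)_\mu + g_{\lambda\mu}(k_3 - k_1)_\nu\right]$$

Four-gluon vertex (gluon lines labelled $(\mu, a)$, $(\nu, b)$, $(\lambda, c)$, $(\rho, d)$):

$$-ig_s^2\left[f_{abe}f_{cde}(g_{\mu\lambda}g_{\nu\rho} - g_{\mu\rho}g_{\nu\lambda}) + f_{ade}f_{bce}(g_{\mu\nu}g_{\lambda\rho} - g_{\mu\lambda}g_{\nu\rho}) + f_{ace}f_{dbe}(g_{\mu\rho}g_{\nu\lambda} - g_{\mu\nu}g_{\lambda\rho})\right]$$

It is important to remember that the rules given above are only adequate
for tree diagram calculations in QCD (see section 13.3.3).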




Q.2 The electroweak theory 


For tree graph calculations, it is convenient to use the U gauge Feynman rules
(sections 19.5 and 19.6) in which no unphysical particles appear. These U
gauge rules are given below for the leptons $l = (e, \mu, \tau)$, $\nu_l = (\nu_e, \nu_\mu, \nu_\tau)$; for
the $t_3 = +1/2$ quarks denoted by f, where f = u, c, t; and for the $t_3 = -1/2$
CKM-mixed quarks denoted by $f'$ where $f' = d', s', b'$.
Note that for simplicity we do not include neutrino flavour mixing.


Q.2.1 External particles
Leptons and quarks

For each fermion or antifermion line entering the graph include the spinor

$$u(p, s) \quad \text{or} \quad v(p, s) \qquad \text{(Q.8)}$$

and for spin-1/2 particles leaving the graph the spinor

$$\bar{u}(p', s') \quad \text{or} \quad \bar{v}(p', s'). \qquad \text{(Q.9)}$$

Vector bosons

For each vector boson line entering the graph include the factor

$$\epsilon_\mu(k, \lambda) \qquad \text{(Q.10)}$$

and for vector bosons leaving the graph the factor

$$\epsilon_\mu^*(k', \lambda'). \qquad \text{(Q.11)}$$
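Q.2.2 Propagators
Leptons and quarks

$$\frac{i}{\not p - m} = i\,\frac{\not p + m}{p^2 - m^2} \qquad \text{(Q.12)}$$

Vector bosons (U gauge)

$$W^\pm, Z^0: \qquad \frac{i}{k^2 - M_V^2}\left(-g_{\mu\nu} + k_\mu k_\nu / M_V^2\right) \qquad \text{(Q.13)}$$

where ‘V’ stands for either ‘W’ (the W-boson) or ‘Z’ (the $Z^0$).

Higgs particle

$$\frac{i}{p^2 - m_H^2} \qquad \text{(Q.14)}$$

Q.2.3 Vertices
Charged current weak interactions

Leptons

$$-i\frac{g}{\sqrt{2}}\gamma_\mu\frac{1 - \gamma_5}{2}$$

Quarks

$$-i\frac{g}{\sqrt{2}}\gamma_\mu\frac{1 - \gamma_5}{2}V_{ff'}$$

Neutral current weak interactions (no neutrino mixing)

Fermions

$$\frac{-ig}{\cos\theta_W}\gamma_\mu\left(c_L^f\,\frac{1 - \gamma_5}{2} + c_R^f\,\frac{1 + \gamma_5}{2}\right),$$

where

$$c_L^f = t_3^f - \sin^2\theta_W\, Q_f \qquad \text{(Q.15)}$$
$$c_R^f = -\sin^2\theta_W\, Q_f, \qquad \text{(Q.16)}$$

and f stands for any fermion.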


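Vector boson couplings

(i) Trilinear couplings:

$\gamma W^+W^-$ vertex (lines labelled $(k_1, \nu)$, $(k_2, \lambda)$, $(k_3, \mu)$):

$$ie\left[g_{\nu\lambda}(k_1 - k_2)_\mu + g_{\lambda\mu}(k_2 - k_3)_\nu + g_{\mu\nu}(k_3 - k_1)_\lambda\right]$$

$Z^0W^+W^-$ vertex (same labelling):

$$ig\cos\theta_W\left[g_{\nu\lambda}(k_1 - k_2)_\mu + g_{\lambda\mu}(k_2 - k_3)_\nu + g_{\mu\nu}(k_3 - k_1)_\lambda\right]$$

(ii) Quadrilinear couplings:

$$-ig^2\cos^2\theta_W\left(2g_{\alpha\beta}g_{\mu\nu} - g_{\alpha\mu}g_{\beta\nu} - g_{\alpha\nu}g_{\beta\mu}\right)$$

Four-W vertex (W lines carrying Lorentz indices $\mu$, $\nu$, $\alpha$, $\beta$):

$$ig^2\left(2g_{\mu\alpha}g_{\nu\beta} - g_{\mu\beta}g_{\alpha\nu} - g_{\mu\nu}g_{\alpha\beta}\right)$$

Higgs couplings

(i) Trilinear couplings

$HW^+W^-$ vertex

$$igM_W\, g_{\nu\lambda}$$

$HZ^0Z^0$ vertex

$$\frac{ig}{\cos\theta_W}M_Z\, g_{\nu\lambda}$$

Fermion Yukawa couplings (fermion mass $m_f$)

$$-i\frac{g}{2}\frac{m_f}{M_W}$$

Trilinear self-coupling

$$-i\frac{3m_H^2\, g}{2M_W}$$

(ii) Quadrilinear couplings:

$HHW^+W^-$ vertex

$$\frac{ig^2}{2}g_{\mu\nu}$$

$HHZZ$ vertex

$$\frac{ig^2}{2\cos^2\theta_W}g_{\mu\nu}$$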

Quadrilinear self-coupling

$$-i\frac{3m_H^2\, g^2}{4M_W^2}$$


Plate I
Comparison between measurements of $\alpha_s$ and the theoretical prediction, as a
function of the energy scale Q (Bethke 2009). (See figure 15.5 on page 129.)

Plate II
The light hadron spectrum of QCD, from Dürr et al. (2008). (See figure 16.12
on page 190.)

Plate III
Constraints in the $\bar\rho, \bar\eta$ plane. The shaded areas have 95% CL. [Figure reproduced,
courtesy Michael Barnett for the Particle Data Group, from the review
of the CKM Quark-Mixing Matrix by A Ceccucci, Z Ligeti and Y Sakai, section
11 in the Review of Particle Physics, K Nakamura et al. (Particle Data
Group) Journal of Physics G 37 (2010) 075021, IOP Publishing Limited.]
(See figure 20.11 on page 323.)

Plate IV
(a) Number of $\eta_f = -1$ candidates in the signal region with a $B^0$ tag ($N_{B^0}$) and
with a $\bar{B}^0$ tag ($N_{\bar{B}^0}$), and (b) the measured asymmetry $(N_{B^0} - N_{\bar{B}^0})/(N_{B^0} + N_{\bar{B}^0})$,
as functions of t; (c) and (d) are the corresponding distributions for
the $\eta_f = +1$ candidates. Figure reprinted with permission from Aubert et al.
(BaBar Collaboration) Phys. Rev. Lett. 99 171803 (2007). Copyright 2007
by the American Physical Society. (See figure 21.7 on page 341.)